1. Introduction
The rapid electrification of field service and maintenance fleets introduces a new layer of operational complexity that classical routing models are ill-equipped to handle jointly: shift-length compliance, fairness in overtime exposure across consecutive shifts, and the hard feasibility constraint imposed by the finite energy capacity of electric vehicles (EVs). Field service organisations increasingly operate over multiple consecutive shifts while trying to control service quality, labour costs, and compliance with safety constraints [
1,
2,
3]. When planning across multiple shifts, decisions made early in the schedule carry forward and affect later periods: routing, task ordering, and time buffers interact across shift boundaries in ways that single-period models do not capture [
4]. The issue is not merely conceptual: overtime clustered at or near shift changes creates practical problems for workforce fatigue, safety, and regulatory compliance [
5].
The adoption of electric vehicles in municipal maintenance and service operations is accelerating under environmental regulation and fleet-electrification mandates [
6]. However, EVs impose a battery–range feasibility constraint that has no analogue in conventional vehicle routing: the energy consumed along a route depends on load, speed profile, road gradient, and auxiliary systems, and must remain within the vehicle’s state-of-charge (SoC) limits at all times. When routes span multiple work shifts, the interaction between shift boundaries, partial recharging opportunities, and residual SoC creates a tightly coupled scheduling problem that the existing multi-shift and E-VRP formulations address only separately [
7,
8].
Uncertainty compounds the planning difficulty. Travel times vary with traffic, weather, and local obstructions; service durations depend on asset conditions, accessibility, and task complexity [
9,
10]. For EVs, energy consumption is further affected by uncertain traffic speed profiles and ambient temperature, making deterministic energy-budget constraints particularly unreliable [
11]. Deterministic formulations, which assume exact knowledge of these quantities, therefore offer a poor fit for many field-service environments [
12]. Stochastic modelling is a natural alternative when abundant, high-quality historical data exist [
13]. In many municipal and specialised-service settings, however, such data are scarce or unreliable, and practitioners more often rely on ranges, rough likelihoods, and expert judgement rather than well-founded frequency estimates [
14].
This type of imprecise, qualitative knowledge is well matched to possibilistic (fuzzy) representations: membership functions and possibility distributions encode what experts can say confidently—upper and lower bounds, plausible modes—without forcing an arbitrary probabilistic law [
15,
16,
17]. Possibility theory, therefore, offers a practical middle ground: it keeps models tractable for optimisation while better reflecting the form of available information in many applied contexts [
18,
19]. In the EV context, modelling both travel times and energy consumption rates as triangular fuzzy numbers captures the expert knowledge about variability without requiring the statistical infrastructure that stochastic models demand.
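As an illustration of the representation described above, the following minimal sketch encodes a triangular fuzzy number and adds two of them component-wise. The class name `TFN`, its field names, and the numerical values are assumptions made for this example only; they are not taken from the paper's implementation.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class TFN:
    """Triangular fuzzy number (low, mode, high); hypothetical helper class."""
    low: float
    mode: float
    high: float

    def __post_init__(self):
        if not (self.low <= self.mode <= self.high):
            raise ValueError("expected low <= mode <= high")

    def __add__(self, other):
        # The sum of two TFNs is again a TFN, obtained component-wise.
        return TFN(self.low + other.low,
                   self.mode + other.mode,
                   self.high + other.high)

    def membership(self, x):
        """Degree to which the value x is considered possible."""
        if x < self.low or x > self.high:
            return 0.0
        if x == self.mode:
            return 1.0
        if x < self.mode:
            return (x - self.low) / (self.mode - self.low)
        return (self.high - x) / (self.high - self.mode)


# Illustrative expert assessments in minutes (assumed values).
travel = TFN(8.0, 10.0, 15.0)
service = TFN(20.0, 25.0, 35.0)
total = travel + service  # TFN(low=28.0, mode=35.0, high=50.0)
```

An expert only has to supply the three numbers per quantity, which is the elicitation format the paragraph above refers to.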
A separate but related concern is how overtime risk is distributed over shifts. Minimising a global metric, such as total makespan, can hide undesirable temporal concentrations of overtime: the schedule may be efficient on average, yet expose particular shifts to a high share of risk, with negative consequences for safety and fairness [
4,
5]. Aggregation operators that are sensitive to the distributional properties of per-shift risk—Ordered Weighted Averaging (OWA) being a prominent example—provide a way to formalise fairness-aware objectives and to penalise inequitable risk concentrations [
20,
21].
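The effect of an OWA operator on concentrated risk can be sketched in a few lines. The function name `owa`, the weight vector, and the per-shift risk values below are illustrative assumptions; the paper's actual weights are not given in this section.

```python
def owa(values, weights):
    """Ordered Weighted Averaging: weights apply to values sorted in descending order."""
    if len(values) != len(weights):
        raise ValueError("values and weights must have the same length")
    if any(w < 0 for w in weights) or abs(sum(weights) - 1.0) > 1e-9:
        raise ValueError("weights must be non-negative and sum to 1")
    ordered = sorted(values, reverse=True)
    return sum(w * v for w, v in zip(weights, ordered))


# Assumed weights that emphasise the worst shift.
weights = [0.6, 0.3, 0.1]

# Two schedules with the same mean per-shift overtime risk (0.2).
balanced = [0.2, 0.2, 0.2]
concentrated = [0.05, 0.05, 0.5]

score_balanced = owa(balanced, weights)
score_concentrated = owa(concentrated, weights)
```

With these assumed weights the concentrated schedule receives the higher (worse) score even though both schedules have the same average risk, which is the behaviour a fairness-aware objective needs.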
On the methodological front, modern machine learning methods have proven useful for routing and related combinatorial problems. Attention-based architectures and reinforcement learning have shown strong empirical performance on a range of VRP variants, particularly when they can exploit problem structure and learn reusable heuristics [
22,
23,
24,
25]. At the same time, classical metaheuristics such as Large Neighbourhood Search (LNS) remain powerful and flexible. Recent work suggests that hybrid approaches, in which learned policies propose or prioritise moves while LNS provides robust, problem-aware neighbourhood exploration, can combine the best of both worlds [
26,
27,
28].
In this paper, we bring these ideas together for the case of single-vehicle routing over multiple shifts under possibilistic uncertainty, with an explicit emphasis on battery feasibility and temporal risk equity. Concretely, we propose the Multi-Shift Single Electric Vehicle Routing Problem under Possibilistic Uncertainty (MS-SEVRP-PU), a bi-objective formulation that balances total makespan and a fairness-sensitive aggregation of per-shift overtime credibilities via an OWA operator [
20], subject to fuzzy energy-consumption constraints that enforce SoC feasibility throughout each shift. We focus on single-vehicle instances because they isolate the core temporal, energy, and risk-allocation issues without adding the combinatorial complexity of multi-vehicle assignment; such instances are also directly relevant to many municipal scenarios (e.g., inspection or maintenance units operating independently) [
1].
We organise the study around the following four research questions:
RQ1—Formulation: How can possibilistic uncertainty, battery–feasibility constraints, and fairness-aware overtime risk be jointly represented in a multi-shift electric vehicle routing model?
RQ2—Energy modelling: How can fuzzy energy consumption rates be integrated into a closed-form credibility framework without sacrificing computational tractability?
RQ3—Solution method: Can a preference-conditioned deep model be effectively combined with risk-aware and battery-conscious local search to boost Pareto front quality over traditional baselines?
RQ4—Practical impact: What trade-offs appear between operational efficiency (makespan), temporal risk distribution, and energy feasibility in practical metropolitan maintenance scenarios?
The main contributions of this work are as follows:
Modelling: MS-SEVRP-PU, a multi-shift single-electric-vehicle routing model that combines possibilistic uncertainty with temporal fairness (OWA-based) and fuzzy state-of-charge feasibility constraints.
Analytical framework: Closed-form formulations for overtime credibility and energy-budget credibility under triangular fuzzy parameters, which avoid Monte Carlo sampling and are amenable to gradient-based computations.
Algorithm: PCT-RABLNS (Pareto-Conditioned Transformer with Risk-Aware and Battery-Conscious Large Neighbourhood Search), a hybrid solver that couples a preference-conditioned transformer for constructive guidance with LNS operators that jointly target shift-boundary violations, high overtime-risk segments, and low state-of-charge situations.
Empirical findings: On municipal electric maintenance benchmarks, PCT-RABLNS attains hypervolume gains of approximately 2–5% over evolutionary baselines, reduces maximum per-shift overtime risk by roughly 15–25% (with makespan overheads of about 1–3%), and ensures full battery feasibility without significant range-anxiety penalties.
Statistical validation and practice: Results are supported by bootstrap confidence intervals, nonparametric tests, and Vargha–Delaney effect sizes. Actionable recommendations show that fairness-oriented objectives can markedly reduce risk concentration, especially when location clustering or operational stress is present, and that battery-aware neighbourhood search is essential to avoid post hoc infeasibility in EV contexts.
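For orientation, the closed-form expressions mentioned in the analytical contribution can be written as follows, assuming the standard credibility measure (the average of possibility and necessity) for a triangular fuzzy number. The symbols \xi, a, b, c and r are notation chosen for this sketch and may differ from the paper's own.

```latex
% Assumption: \xi = (a, b, c) is a triangular fuzzy number with a < b < c,
% and r is a crisp threshold (a shift length or a battery budget).
\[
\operatorname{Cr}\{\xi \le r\} =
\begin{cases}
0, & r \le a,\\[4pt]
\dfrac{r-a}{2(b-a)}, & a < r \le b,\\[8pt]
\dfrac{r-2b+c}{2(c-b)}, & b < r \le c,\\[8pt]
1, & r > c.
\end{cases}
\]
% Credibility is self-dual, so the credibility of exceeding r follows directly.
\[
\operatorname{Cr}\{\xi > r\} = 1 - \operatorname{Cr}\{\xi \le r\}.
\]
```

Under these assumptions, an overtime credibility is obtained by taking \xi as the fuzzy completion time of a shift and r as the shift length, while an energy-budget credibility takes \xi as the fuzzy energy consumed and r as the usable battery capacity. Both are piecewise linear in r, so no sampling is needed.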
In short, the paper argues that (i) possibilistic uncertainty better reflects the limited but structured knowledge available to many field operators, including uncertain energy consumption in EV fleets; (ii) explicit aggregation of temporal risk is necessary to avoid concentrated overtime exposure; and (iii) hybrid learning-augmented search methods that incorporate battery–feasibility awareness provide an effective computational route to balance efficiency, equity, and range safety in multi-shift electric vehicle routing. The motivation is practical: tighter environmental regulation, growing concern for worker safety, and the operational peculiarities of electric fleets make it important to control not only how long routes take and how overtime risk accumulates across shifts, but also whether the vehicle’s battery can sustain the planned operations throughout each shift [
4,
5,
7].
6. Discussion
Our experimental results show that PCT-RABLNS consistently surpasses established baselines across all tested instance families. Hypervolume improves by roughly 2–5%, the maximum per-shift overtime risk decreases by 15–25%, and the SoC compliance rate reaches 1.00 on every case, while the overall makespan increases only modestly, by 1–3%. These findings suggest that integrating preference-conditioned learning with a risk-aware and battery-conscious search strategy effectively balances efficiency, fairness, and energy feasibility when operating times and consumption rates are represented through possibilistic uncertainty.
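Since hypervolume is the headline quality indicator, a minimal sketch of how it can be computed for two minimised objectives (here makespan and aggregated overtime risk) may be helpful. The function name, the reference point, and the sample front are assumptions for illustration; the paper's reference point and normalisation are not stated in this section.

```python
def hypervolume_2d(points, ref):
    """Area dominated by `points` and bounded by `ref`; both objectives are minimised."""
    # Keep only points that are strictly better than the reference point.
    pts = sorted(p for p in points if p[0] < ref[0] and p[1] < ref[1])
    hv = 0.0
    best_f2 = ref[1]
    for f1, f2 in pts:
        # Points are scanned by increasing f1; only those that improve f2
        # add a new rectangular slab to the dominated area.
        if f2 < best_f2:
            hv += (ref[0] - f1) * (best_f2 - f2)
            best_f2 = f2
    return hv


# Assumed toy front: (makespan, aggregated risk) in normalised units.
front = [(1.0, 3.0), (2.0, 2.0), (3.0, 1.0), (3.0, 3.0)]
reference = (4.0, 4.0)
hv = hypervolume_2d(front, reference)  # 6.0; the dominated point (3, 3) adds nothing
```

A relative gain such as the 2–5% reported above would then be `(hv_method - hv_baseline) / hv_baseline`, computed with a common reference point.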
The case studies reveal an important structural pattern: explicitly modelling fairness through OWA aggregation alters the architecture of Pareto-optimal solutions compared to conventional makespan-focused objectives. Without fairness considerations, both evolutionary and learning-based methods tend to concentrate overtime in later shifts—a “risk dumping” phenomenon that minimises makespan by front-loading efficient tasks but pushes compliance risk onto trailing shifts. When OWA weights penalise this concentration, solutions shift toward more balanced distributions, reducing maximum shift risk by 15–25% while incurring only a 1–3% makespan increase. In contexts such as municipal electric maintenance, where regulatory compliance, driver fatigue, and vehicle availability interact, this represents a meaningful improvement in equity without compromising operational throughput [
5,
33].
The introduction of fuzzy energy consumption rates adds a second dimension of uncertainty that prior multi-shift routing models neglect. Encoding arc energy as TFNs allows planners to express optimistic, most-likely, and pessimistic energy budgets without requiring historical consumption logs, which are rarely available for recently electrified fleets. The SoC credibility constraint provides a calibrated safety margin: at the adopted credibility threshold, only 10% of the possibilistic mass exceeds the battery limit, analogous to a 90th-percentile reserve in stochastic models but without requiring a probability distribution. The penalty term in the scalarised objective steers the constructive policy away from solutions that the SoC-critical destroy operator would otherwise have to repair in a later LNS step, thereby speeding convergence on the energy-constrained cases (C2, C3).
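The SoC credibility check described above can be sketched as follows, assuming the standard credibility measure for triangular fuzzy numbers and a threshold of 0.9 (an assumption suggested by the 10% figure in the text). The function names, the tuple encoding of TFNs, and the energy values are hypothetical.

```python
def cr_leq(tfn, r):
    """Credibility that a triangular fuzzy quantity (a, b, c) does not exceed r."""
    a, b, c = tfn
    if r < a:
        return 0.0
    if r >= c:
        return 1.0
    if r < b:
        return (r - a) / (2.0 * (b - a))
    return (r - 2.0 * b + c) / (2.0 * (c - b))


def shift_energy(arc_energies):
    """Fuzzy energy of a shift: component-wise sum of the arc TFNs."""
    return tuple(sum(e[i] for e in arc_energies) for i in range(3))


def soc_feasible(arc_energies, capacity, threshold=0.9):
    # threshold=0.9 is an assumed value, not one confirmed by the source.
    return cr_leq(shift_energy(arc_energies), capacity) >= threshold


# Assumed arc energies in kWh: (optimistic, most likely, pessimistic).
arcs = [(1.0, 1.5, 2.5), (2.0, 2.5, 3.5), (1.5, 2.0, 3.0)]
total = shift_energy(arcs)           # (4.5, 6.0, 9.0)
ok_large = soc_feasible(arcs, 8.5)   # credibility is about 0.917
ok_small = soc_feasible(arcs, 7.0)   # credibility is about 0.667
```

Because the check is a handful of arithmetic operations, it can be evaluated inside every move of a local search without noticeable overhead.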
The credibility-based approach for triangular fuzzy numbers provides a practical bridge between expert knowledge and quantitative optimisation. Unlike probabilistic models, which require large historical datasets to estimate distributions, TFN representations naturally accommodate lower–modal–upper assessments commonly elicited in planning practice; for travel-time and energy quantities in this study, lower values are operationally optimistic and upper values pessimistic [
18,
36]. Evaluating the credibility function—both for overtime and for energy consumption—is computationally inexpensive within heuristic search, avoids Monte Carlo sampling, and remains differentiable, which is essential for the reinforcement learning training phase where policies are evaluated millions of times. Beyond electric vehicle routing, this framework applies to any domain where uncertainty arises primarily from expert judgement, including project scheduling, resource allocation, and early-stage facility design.
Training a single policy conditioned on preference vectors allows PCT-RABLNS to bypass the computational cost of multiple independent optimisation runs or the large populations typical of evolutionary algorithms. The transformer’s attention layers capture regularities in high-quality Pareto-optimal solutions and adapt construction to the region of the trade-off space specified by the preference vector. The SoC-aware feasibility mask embedded in the policy head ensures that the constructive phase never proposes actions that would violate the hard battery budget at the modal energy level, propagating energy awareness into every autoregressive decoding step. This produces an initial archive of guaranteed SoC-feasible solutions from which the LNS starts, explaining why PCT-RABLNS achieves a compliance rate of 1.00, whereas the baselines accumulate infeasible solutions (IBEA: 0.83–0.96, Neural-LNS: 0.89–0.98 across cases).
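One way such a feasibility mask could work is sketched below. It is a simplified illustration, not the paper's policy head: the function name, the dictionary output, and in particular the rule that enough modal energy must remain to return to the depot are assumptions introduced here.

```python
def soc_feasible_mask(current, unvisited, used_modal, capacity,
                      modal_energy, depot=0):
    """Mark which next tasks keep the shift within the battery budget.

    All energies are modal (most likely) values. Reserving energy for the
    return leg to the depot is an assumption of this sketch.
    """
    mask = {}
    for j in unvisited:
        needed = used_modal + modal_energy[current][j] + modal_energy[j][depot]
        mask[j] = needed <= capacity
    return mask


# Assumed symmetric modal energy matrix in kWh (node 0 is the depot).
E = [
    [0.0, 2.0, 4.0, 6.0],
    [2.0, 0.0, 3.0, 5.0],
    [4.0, 3.0, 0.0, 2.0],
    [6.0, 5.0, 2.0, 0.0],
]

# The vehicle is at node 1, has used 3 kWh, and carries a 10 kWh budget.
mask = soc_feasible_mask(current=1, unvisited=[2, 3], used_modal=3.0,
                         capacity=10.0, modal_energy=E)
```

In an autoregressive decoder, entries marked `False` would typically have their logits set to negative infinity before the softmax, so that infeasible tasks are never sampled.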
The four destroy operators (boundary-aware, risk-proportional, SoC-critical, and cluster-based) leverage structural features of multi-shift electric vehicle routing that general-purpose metaheuristics typically overlook. The boundary-aware destroy operator focuses on tasks near shift transitions, where small changes disproportionately affect overtime credibility. Risk-proportional destroy removes tasks preferentially from high-risk shifts, directly promoting fairness. SoC-critical destroy targets shifts with insufficient battery margin, redistributing energy-intensive tasks into lower-consumption periods and enabling the repair policy to identify more efficient charge profiles. The cluster-based destroy operator enables large-scale spatial rebalancing in instances with geographical grouping (C2, C3). Our ablation study (
Section 5.7) shows that the SoC-critical destroy operator is the primary driver of battery compliance (the SoC rate drops when it is removed) while having only a modest effect on the Pareto geometry (HV), confirming its role as a specialised feasibility–repair mechanism rather than a general-purpose improvement operator. The adaptive operator selection identifies which operators contribute most effectively across instance types: cluster-based destroy dominates on spatially clustered instances (C2, C3), risk-proportional destroy is essential under fairness stress (C4), and SoC-critical destroy is most active in C3, where the battery-to-demand ratio is tightest.
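To make the idea of a risk-proportional destroy step concrete, the sketch below removes tasks from shifts with probability proportional to each shift's overtime credibility. It is a simplified illustration under stated assumptions: the function name, the data layout (one task list per shift), the uniform choice of the task inside the selected shift, and the fact that risks are not recomputed between removals are all choices made for this example.

```python
import random


def risk_proportional_destroy(shifts, risks, k, rng):
    """Remove up to k tasks, sampling the shift in proportion to its risk."""
    shifts = [list(s) for s in shifts]  # work on a copy
    removed = []
    for _ in range(k):
        candidates = [i for i, s in enumerate(shifts) if s]
        if not candidates:
            break
        weights = [risks[i] for i in candidates]
        if sum(weights) <= 0.0:
            # No shift carries risk: fall back to a uniform choice.
            weights = [1.0] * len(candidates)
        i = rng.choices(candidates, weights=weights, k=1)[0]
        task = shifts[i].pop(rng.randrange(len(shifts[i])))
        removed.append(task)
    return shifts, removed


# Assumed toy schedule: three shifts, with all overtime risk in the last one.
schedule = [[1, 2], [3, 4], [5, 6, 7]]
per_shift_risk = [0.0, 0.0, 0.8]
partial, removed = risk_proportional_destroy(
    schedule, per_shift_risk, k=2, rng=random.Random(0))
```

The removed tasks are then handed to a repair step, which is free to reinsert them into lower-risk shifts.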
By combining learned construction with classical local search, PCT-RABLNS consistently outperforms approaches based purely on learning or purely on metaheuristics. The preference-conditioned transformer generates energy-feasible and risk-balanced starting solutions by extracting patterns from thousands of training instances, while the LNS component systematically explores neighbourhoods and applies SA acceptance rules to refine solutions. This interplay is particularly important for complex instances (C2–C4), where learning uncovers structural regularities—including recurring relationships between spatial clusters, shift boundaries, and energy consumption—but achieving high-quality solutions still requires thorough local optimisation.
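The acceptance step mentioned above can be sketched as follows, assuming that SA refers to simulated annealing with the usual Metropolis criterion applied to a scalarised cost. The function name and the temperature handling are assumptions; the paper's cooling schedule is not described in this section.

```python
import math
import random


def sa_accept(current_cost, candidate_cost, temperature, rng):
    """Metropolis rule: always accept improvements, sometimes accept worse moves."""
    if candidate_cost <= current_cost:
        return True
    if temperature <= 0.0:
        return False
    delta = candidate_cost - current_cost
    return rng.random() < math.exp(-delta / temperature)


rng = random.Random(42)
accept_better = sa_accept(10.0, 9.0, temperature=1.0, rng=rng)
accept_worse_when_cold = sa_accept(10.0, 11.0, temperature=0.0, rng=rng)
```

Accepting some worsening moves at high temperature lets the search leave local optima that a purely greedy LNS would be stuck in.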
It is useful to summarise the role and the empirical contribution of each architectural element, since the framework combines several modelling and algorithmic ingredients. The possibilistic representation (TFNs for travel, service and energy) is the modelling block that turns expert assessments into a tractable mathematical object; without it, the formulation collapses to a deterministic E-VRP that cannot capture the imprecision typical of recently electrified municipal fleets. The closed-form credibility evaluation is the analytical block that makes both the overtime and the SoC constraints differentiable and inexpensive to evaluate, which, in turn, enables the gradient-based RL training and the high evaluation throughput observed in Table A6. The OWA fairness objective is the decision-theoretic block: the ablation in Table 6 shows that removing it inflates the maximum shift risk by 42% with only a 1.9% makespan saving, confirming that fairness is not implicit in efficient routing. The preference-conditioned transformer is the constructive block: a single trained policy covers the entire Pareto spectrum, and its ablation lowers HV. The four destroy operators are the refinement block: the boundary-aware, risk-proportional and cluster-based operators jointly improve Pareto geometry, while the SoC-critical operator is the dedicated feasibility–repair mechanism whose removal alone causes the largest drop in SoC compliance. Read together, the components are synergistic rather than redundant: each one moves a different metric (HV, max risk, SoC rate or convergence time) and the full pipeline is needed to obtain simultaneous gains in all four.
A natural question is whether the additional complexity of preference-conditioned reinforcement learning is worth its cost compared with simpler heuristic baselines. Our experiments allow a direct trade-off assessment. On the methodological side, PCT-RABLNS introduces a one-time training cost of approximately 60 GPU-hours on an NVIDIA A100, additional GPU memory of about 3.4 GB at inference, and roughly 200 s of GPU time per 1200 s run on case C3 (
Table A6). On the performance side, it delivers 2–5% higher hypervolume, 15–25% lower maximum shift risk, full SoC compliance, and a 16–28% reduction in time-to-90%-HV against an indicator-based evolutionary algorithm (IBEA) and a fixed-weight neural LNS that share the same wall-clock budget and the same family of destroy operators. Once the policy is trained, inference and LNS refinement complete within the 5–20 min planning horizon typical of municipal day-ahead scheduling, so the operational gain—fairer schedules with guaranteed battery feasibility—comes at no extra runtime cost. The trade-off is therefore favourable when the planning problem is recurrent (training cost is amortised over many planning days), when fairness and battery feasibility are first-class constraints, and when expert-elicited TFNs are available. For ad hoc single-instance planning with unrestricted runtime and without fairness or SoC requirements, a well-tuned IBEA or even a strong metaheuristic without learning would remain a sensible, simpler choice.
The generalisation capability of PCT-RABLNS depends on the diversity of the training data. Our dataset covers a range of instance sizes, shift counts, spatial layouts (uniform and clustered), uncertainty levels, and energy spreads. Performance on the held-out case studies C1–C4, which resemble but do not exactly match the training distributions, indicates good generalisation. For problem variants outside this scope, fine-tuning provides a practical adaptation strategy without full retraining.
The experimental protocol was designed for robustness: all methods operated under identical computational budgets, each configuration was repeated with ten random seeds, and non-parametric statistical tests with Holm–Bonferroni correction were applied. Effect sizes quantify the practical relevance of observed differences. For medium and large instances (C2 and C3), effect sizes are substantial, highlighting the operational significance of the improvements. From a practical standpoint, PCT-RABLNS offers a favourable computational profile: training is a one-time cost of approximately 60 GPU-hours on an NVIDIA A100 40 GB, whereas inference takes seconds and LNS refinement completes within typical planning horizons (5–20 min depending on instance size). This fits naturally with day-ahead or shift-level scheduling in municipal electric maintenance operations.
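The two statistical tools named above are simple enough to sketch directly. The code below computes the Vargha–Delaney effect size from two samples and applies the Holm–Bonferroni step-down procedure to a list of p-values. The function names and the sample values are assumptions; the underlying non-parametric test that produces the p-values is not specified in this section and is therefore left out.

```python
def vargha_delaney_a12(xs, ys):
    """Probability that a value from xs exceeds one from ys, counting ties as half."""
    greater = sum(1 for x in xs for y in ys if x > y)
    ties = sum(1 for x in xs for y in ys if x == y)
    return (greater + 0.5 * ties) / (len(xs) * len(ys))


def holm_bonferroni(p_values, alpha=0.05):
    """Return, for each hypothesis, whether it is rejected at family-wise level alpha."""
    m = len(p_values)
    order = sorted(range(m), key=lambda i: p_values[i])
    reject = [False] * m
    for rank, i in enumerate(order):
        # The smallest p-value is compared with alpha/m, the next with alpha/(m-1), ...
        if p_values[i] <= alpha / (m - rank):
            reject[i] = True
        else:
            break  # once one test fails, all larger p-values are retained
    return reject


# Assumed toy data: hypervolume of two methods over three seeds each.
a12 = vargha_delaney_a12([3.0, 4.0, 5.0], [1.0, 2.0, 3.0])
decisions = holm_bonferroni([0.01, 0.04, 0.03])
```

An effect size of 0.5 means the two methods are indistinguishable, while values close to 1 (or 0) indicate that one method is almost always better. Here `a12` equals 8.5/9 and only the first hypothesis is rejected.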
Several limitations remain. Our work focuses on a single-vehicle setting; multi-vehicle extensions require coordinated workload and energy balancing across both vehicles and shifts. The single-vehicle scope was deliberate: it isolates the coupling between multi-shift segmentation, fairness in overtime exposure and battery feasibility, which is the core methodological contribution of this paper. It also matches a common operational pattern in municipal service, in which inspection or maintenance units operate independently and are planned one vehicle at a time. The framework, however, lends itself to a multi-vehicle extension along three lines. First, the solution encoding generalises to a vector of per-vehicle sequences, with an additional assignment step (each task to one vehicle) that can be performed by a separate routing head of the transformer or by a clustered warm start. Second, the OWA fairness objective extends naturally from “risk per shift of one vehicle” to “risk per (vehicle, shift) cell”, preserving the closed-form credibility expression and only increasing the length of the sorted risk vector. Third, the four destroy operators remain meaningful: the boundary-aware and risk-proportional operators generalise per-vehicle, the SoC-critical operator extends to inter-vehicle task swaps that move energy-intensive tasks to vehicles with more residual SoC, and a new cross-vehicle exchange operator can be added to share workload. The main computational implication is that the transformer must encode a (vehicle, shift) context vector instead of a single shift index, which we expect to add proportionally to the training cost but not to change the qualitative behaviour observed in the single-vehicle case. We mark this as the most important direction for future work.
The benchmarks are synthetic approximations of realistic instances: the network sizes, shift structures, battery capacities, TFN spreads and uncertainty asymmetries reproduce parameters drawn from anonymised, aggregated planning records of an operational electrified municipal maintenance unit (see the Acknowledgments and uncertainty_parameters.json [37]), but no real-world routing instance is reproduced node-for-node. Validation using audited operational logs from an electrified municipal fleet would therefore strengthen external validity. We assume a static task set and a full recharge at each depot return; real-time arrivals and partial opportunity charging call for online learning and anytime algorithms. The possibilistic quantification simplifies uncertainty modelling by avoiding full probability distributions, but hybrid probabilistic–possibilistic approaches could blend expert input with partial historical data when both sources are available. OWA is one route to expressing fairness: lexicographic rules or constraint-based formulations may be more appropriate in contexts where regulatory thresholds are hard rather than soft.
Nevertheless, the integration of possibilistic uncertainty, OWA-driven fairness, SoC-credibility constraints, and preference-conditioned learning constitutes a coherent framework for fair and energy-safe optimisation under ambiguity. The experimental results demonstrate improvements in all four dimensions simultaneously (Pareto quality, fairness, battery feasibility, and convergence speed), supported by statistical tests and operationally meaningful effect sizes.
7. Conclusions
This paper has introduced a unified framework for multi-shift, single-vehicle electric service routing in which fairness in the distribution of overtime risk and battery–range feasibility are treated as first-class concerns rather than afterthoughts. The MS-SEVRP-PU formulation brings together, in a single bi-objective programme, three ingredients that prior work has only addressed in isolation: a possibilistic representation of imprecise travel times, service durations and arc/on-site energy consumption through triangular fuzzy numbers; a closed-form credibility evaluation that turns both overtime and state-of-charge constraints into differentiable, computationally inexpensive expressions; and an Ordered Weighted Averaging aggregation of per-shift overtime credibilities that explicitly penalises inequitable risk concentrations (answer to RQ1). Together, these elements show that imprecise expert knowledge (typical of recently electrified municipal fleets) can be encoded without forcing a probabilistic law, and that the resulting fuzzy quantities can be embedded in a tractable optimisation model that admits gradient-based reasoning, with a lower per-solution evaluation cost than an equivalent Monte Carlo estimate (answer to RQ2).
On the algorithmic side, the proposed PCT-RABLNS solver demonstrates that learning and classical local search are complementary rather than substitutes (answer to RQ3). A single Pareto-Conditioned Transformer, trained via multi-objective reinforcement learning with uniform preference sampling, generates high-quality and SoC-feasible initial solutions across the entire trade-off spectrum, while a Risk-Aware and Battery-Conscious Large Neighbourhood Search refines them through four specialised destroy operators—boundary-aware, risk-proportional, SoC-critical and cluster-based—selected adaptively. The ablation study confirms that each component plays a distinct, non-redundant role: preference conditioning is the main driver of Pareto-front quality, the OWA objective is the main driver of fairness, and the SoC-critical destroy operator is the main driver of battery compliance. None of these gains is achievable by any single component in isolation.
The empirical evidence supports the practical relevance of the framework (answer to RQ4). On the four calibrated municipal–maintenance case studies, PCT-RABLNS improves hypervolume by 2–5% over strong evolutionary and learning baselines, reduces the maximum per-shift overtime risk by 15–25% and the Gini coefficient by 12–30%, and reaches 90% of its final hypervolume 16–28% faster than IBEA and 8–19% faster than Neural-LNS. Crucially, every archived solution satisfies the SoC credibility constraint for all shifts, against compliance rates as low as 0.83 for IBEA and 0.89 for Neural-LNS on the energy-intensive cases. These improvements are obtained at a marginal makespan overhead of only 1–3%, are statistically significant under nonparametric tests with Holm–Bonferroni correction, and exhibit large Vargha–Delaney effect sizes on medium and large instances. The operational reading of these numbers is that fairness in overtime exposure and guaranteed battery feasibility—two requirements that municipal operators increasingly have to reconcile with regulatory and labour constraints—can be attained without sacrificing throughput, and within the 5–20 min planning horizon typical of day-ahead and shift-level scheduling.
These results should be interpreted in light of the study design. The benchmarks are calibrated synthetic instances whose parameters were elicited from the operations engineers of an industrial partner; while they reproduce the operating regimes of an electrified municipal maintenance unit, validation on audited field logs from a multi-vehicle fleet remains an important next step. The work also assumes a static task set and a full recharge at each depot return; real-time arrivals and partial opportunity charging would call for online learning and anytime variants of the solver. The single-vehicle scope was a deliberate methodological choice, since it isolates the coupling between multi-shift segmentation, fairness and battery feasibility, but the encoding, the OWA aggregation and the four destroy operators all admit natural extensions to a multi-vehicle setting, as discussed in Section 6.
Beyond electric vehicle routing, the methodological building blocks generalise to a broader class of fairness-sensitive decision problems under imprecise information. OWA-based risk aggregation transfers directly to workforce scheduling, resource allocation and service network design whenever equitable distribution across periods or agents is a goal in itself. The SoC credibility constraint is applicable to any battery-operated asset scheduling problem. Preference-conditioned policy learning extends to other multi-objective combinatorial problems where the Pareto front must be approximated by a single trained model. Analytical possibilistic evaluation provides a reusable bridge between expert knowledge and quantitative optimisation when historical data are scarce or unreliable.
Promising directions for future work include the multi-vehicle EV extension with joint workload and charging coordination; dynamic task arrivals handled by online replanning; alternative fairness operators such as lexicographic and Nash-based objectives; hybrid probabilistic–possibilistic uncertainty models that blend expert input with partial historical data; and real-world validation with operational logs from electrified municipal fleets.