1. Introduction
With the world moving towards sustainability, companies are increasingly aware of their environmental footprint and the operational costs it entails. The desire to be “greener” is no longer simply about climate change but now has real strategic benefits for businesses in terms of both regulatory compliance and competitive advantage [1]. Among all sectors, transportation and logistics remain pivotal to global economic activity; nevertheless, their environmental impact, operational complexity, and exposure to real-world constraints such as fluctuating demand and urban mobility challenges have driven significant academic interest in optimizing vehicle tours and routing strategies.
The swift expansion of e-mobility presents new technological challenges for the optimization of delivery operations, particularly those related to electric vehicle battery performance and range limitations. These characteristics drive researchers and practitioners to reevaluate traditional routing models, modifying them to enhance economic efficiency while also addressing sustainability objectives and infrastructural constraints, particularly in developing markets. Despite these obstacles, managing logistics to achieve a balance among efficiency, feasibility, and environmental impact is regarded as vital to sustainable urban growth and business resilience.
Recent metaheuristics for routing problems often rely on classical neighborhoods such as 2-opt, swap and relocate [2], and some works further extend EVRPs with clustered customers and multi-echelon distribution structures [3]. In contrast, this study investigates how modeling and algorithmic choices shape electric vehicle routing performance in a short-distance urban context. The approach introduces a construction–repair SA framework with differentiated penalties for lateness, unserved customers and capacity violations, together with a state-dependent charging policy that adapts recharge duration to the current state of charge rather than relying on a fixed charging time. Finally, the optimized routes are validated through a two-layer workflow that links the planner to microscopic SUMO/TraCI simulation, quantifying the gap between abstract routing metrics and network-level distance, travel time and energy outcomes on the real Marrakesh network.
The remainder of this paper is organized as follows:
Section 2 summarizes the literature on sustainable vehicle routing, time-window constraints, and metaheuristic solution techniques.
Section 3 describes the mathematical formulation of the routing problem studied, emphasizing the use of realistic operational constraints and penalty functions.
Section 4 presents the calibration tests conducted.
Section 5 presents and discusses the computational results applied to the context of Marrakesh, Morocco, as well as a comparison with microscopic simulation.
2. Related Work
Over the past decades, metaheuristics have greatly improved the quality and applicability of solutions to the vehicle routing problem (VRP) by enabling the exploration of vast and intricate search spaces that are difficult for exact algorithms to traverse. They thereby advanced logistics and transportation planning by providing good, often near-optimal solutions to the VRP before domain-specific variants such as the Green Vehicle Routing Problem (GVRP) and the electric vehicle routing problem (EVRP) emerged after 2000.
Researchers such as Paolo Toth [4], Daniele Vigo, and Gilbert Laporte [5] leveraged the foundational metaheuristic techniques developed in the 1990s to advance solutions for larger and more complex VRP instances, integrating practical constraints and novel problem characteristics that more accurately represented real-world scenarios. Consequently, the fundamental algorithmic advancements of the 1990s facilitated the extensive diversification and methodological refinement of VRP research in the early 2000s.
The significant methodological advances in VRP variants between 2000 and 2010, such as the DVRP, PVRP, VRPB, MDVRP, and LRP, laid the foundation for addressing increasingly complex and realistic logistics scenarios. A notable example is the Dynamic Vehicle Routing Problem (DVRP), in which service requests emerge in real time and require route adjustments. In this context, Psaraftis [6] provides a comprehensive analysis of DVRP research, categorizing dynamic variants and assessing methodologies for handling real-time service requests and route modifications.
The Periodic Vehicle Routing Problem (PVRP) imposes constraints on visit frequencies over a planning horizon. For example, Francis [7] provides a comprehensive analysis of the PVRP, emphasizing scenarios in which customers require multiple visits during the planning horizon. The analysis covers essential models, algorithms, and application domains, and addresses extensions beyond the basic framework.
The Vehicle Routing Problem with Backhauls (VRPB) extends the classical VRP by allowing each route to include both deliveries and pickups. In a seminal study, Goetschalckx and Jacobs-Blecha [8] analyze routes where vehicles first complete all linehaul deliveries and then perform backhaul pickups before returning to the depot.
The Multi-Depot Vehicle Routing Problem (MDVRP) extends the classical VRP by allowing several depots to serve customers, while the Location Routing Problem (LRP) combines depot siting and route design in a single framework. Cordeau [9] develops tabu search algorithms for complex MDVRP-type settings, showing how unified metaheuristics can handle multiple depots and rich side constraints, and Laporte, Nobert and Arpin [10] study LRPs that jointly choose facility locations and vehicle routes, reflecting the integrated network-design decisions found in real supply chains.
From 2010 to 2014, research on the Green Vehicle Routing Problem (GVRP) intensified, emphasizing the reduction of environmental consequences alongside the preservation of operational efficiency.
This period’s significant contribution is the development of mathematical models for GVRP that explicitly account for routes of alternative fuel vehicles, including refueling/recharging stations, focusing on minimizing travel distances and CO₂ emissions while taking into account driving range constraints. Key contributions include the pollution-routing framework of Bektaş and Laporte [11], and the first GVRP model with alternative-fuel vehicles and refueling stations by Erdogan and Miller-Hooks [12], which explicitly accounts for fuel consumption, emissions and refueling infrastructure. Subsequent studies refined emission and fuel-use models, often using ALNS and other metaheuristics, and surveyed green routing and scheduling problems with a strong multi-objective perspective [13,14,15]. This stream of work shows how routing decisions can be evaluated not only on cost and service quality but also on energy efficiency and environmental impact, providing the broader context in which EVRPTW models are now being developed.
Felipe [16] introduced the Green Vehicle Routing Problem with Partial Recharging, which combines three different charging technologies with partial recharging and varying charging velocities. They highlighted the operational advantages of partial recharge strategies in electric vehicle routing and suggested solution methodologies that combine simulated annealing (SA) and local search (LS).
Erdogan and Miller-Hooks [17] introduced an innovative GVRP model that enables vehicles to recharge or refuel during transit, recognizing the inadequate infrastructure of Alternative Fuel Stations (AFS) and the necessity to prevent vehicle stranding. This model included service time, customer demands, and route duration constraints, serving as a foundational framework for subsequent extensions like the Electric Vehicle Routing Problem with Time Windows and Recharging Stations (E-VRPTW) created by Schneider, Stenger, and Goeke [18], which incorporated time-window constraints and heuristic solutions that merged Variable Neighborhood Search with tabu search [9].
By examining operating costs and performance indicators, Davis and Figliozzi [19] created a methodology to evaluate the competitiveness of electric delivery trucks. While ref. [20] offered strategies for internalizing the environmental costs associated with logistics systems, Sinha and Labi [21] developed a thorough framework for the evaluation and planning of transportation projects. An updated manual for calculating external transport costs across various modalities was presented by Korzhenevych et al. [22]. In order to evaluate the operational feasibility of more ecologically friendly fleet configurations, Juan, Goentzel, and Bektaş [23] also looked into the routing of fleets with varying driving ranges.
EVRPTW research has focused on modeling range limits, charging operations and their interaction with routing. Early work such as Schneider et al. and Keskin and Çatay [18,24] integrated recharging stations, partial recharging strategies and battery models into heuristic EVRPTW formulations, while later studies explored more complex charging technologies, including multiple chargers and piecewise-linear charging functions [25]. Other contributions examined how charging infrastructure and grid constraints affect feasible routing plans, from battery swapping and swap-station location decisions [26,27] to the impact of slow/fast charging allocation and power-network limits on fleet operation [28,29]. Together, these works underline that realistic EV routing models must jointly consider vehicle energy consumption, charging technology and infrastructure availability.
Building on these general contributions, recent work has focused more specifically on electric vehicle routing problems (EVRPs) with realistic charging and energy models. For example, Yang et al. [2] propose an EVRP with a flexible charging strategy based on real-time charging functions and solve it with an evolutionary algorithm that combines ant colony search, variable-neighborhood operators and reinforcement-learning-based dimensionality reduction, showing clear performance gains on large Solomon-type instances.
Latorre-Biel [3] proposes a hybrid genetic search (HGS) for the generalized VRP that manages separate feasible and infeasible sub-populations to balance intensification and diversity, represents solutions with multiple chromosomes (giant tour, route and node-level), reconstructs routes via ordered crossover and a SPLIT procedure, and applies an intensive local search phase with relocation, swap and 2-opt moves at the cluster level to reach or improve many best-known solutions on large benchmarks.
Recent studies have extended classical EVRPTW models by combining customer clustering with multi-echelon logistics structures. Wu and Bao [30] address the two-echelon vehicle routing problem with clustered customers (2E-VRP-CC), where freight is transported from a depot to satellites before being delivered to customer clusters that must be served consecutively by the same second-echelon vehicle. To solve the problem, they formulate a MILP model and develop an adaptive large neighborhood search (ALNS) framework featuring tailored destroy-and-repair operators for cluster and satellite management.
In summary, the literature on vehicle routing has progressed from initial applications of metaheuristics in conventional fleet routing issues to the formulation of advanced models such as the electric vehicle routing problem (EVRP) and the Green Vehicle Routing Problem (GVRP).
This study addresses the following question: how can an EVRPTW model and a simulated annealing metaheuristic be used to plan depot-based electric last-mile deliveries with time windows in Marrakesh under realistic energy constraints? The specific objectives are: (i) to formulate an EVRPTW model with distance, lateness, and unserved-customer penalties, (ii) to design and calibrate a simulated annealing approach with greedy construction, repair, and 2-opt improvement, and (iii) to evaluate this approach on Marrakesh-based instances using realistic travel times and energy consumption. The next section first describes the problem formulation and then details the solution approach.
3. Problem Formulation
The E-VRPTW examined in this study is formally represented as a directed graph G = (N, A) (Figure 1; two depots are drawn in the figure only for clarity), where the node set N includes the central depot and all client nodes, some of which are equipped with charging facilities, while the arc set A connects each ordered pair of nodes and carries the corresponding travel distance and travel time. Each arc (i, j) ∈ A represents a direct connection between two locations (depot or client), and the routing problem consists of determining a set of vehicle tours, each defined by a sequence of arcs traversed by a vehicle, that collectively serve all customers.
Figure 1 illustrates our problem, showcasing the depot, client nodes, and charging-capable sites in client locations, with the viable connections represented as directed arcs. The vehicle fleet is uniform, comprising identical electric vehicles with a nominal cargo capacity of 110 units and an effective operational limit of 99 units (90% of nominal), a decrease that implicitly considers the increased energy consumption linked to payload weight without explicitly modeling the physical weight–energy relationship.
The problem is subject to a set of operational restrictions found in the E-VRPTW literature: vehicle capacity, time-window compliance, route continuity, energy feasibility and charging dynamics. The objective function to be minimized is described next.
The objective function combines three components. The first term minimizes the total traveled distance of the fleet. The second term penalizes customer lateness proportionally to the amount of delay, controlled by the coefficient λ. The third term penalizes each unserved customer using the coefficient γ. These two penalty coefficients allow the algorithm to balance distance minimization against service quality and feasibility. Low penalty values favor shorter routes even if they contain late or unserved customers, whereas high penalty values force the search toward complete and time-feasible solutions.
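One way to write an objective that matches this description is sketched below. The symbols are assumptions introduced for illustration, since the original notation is not reproduced here: $x_{ij}^{k}$ equals 1 if vehicle $k$ traverses arc $(i,j)$, $d_{ij}$ is the arc distance, $L_i$ is the lateness at customer $i$, $u_i$ equals 1 if customer $i$ is left unserved, and $C$ and $K$ denote the customer and vehicle sets. Only $\lambda$ and $\gamma$ are taken from the text.

```latex
\min \; Z \;=\; \sum_{k \in K} \sum_{(i,j) \in A} d_{ij}\, x_{ij}^{k}
\;+\; \lambda \sum_{i \in C} L_i
\;+\; \gamma \sum_{i \in C} u_i
```

With $\lambda = \gamma = 0$ this form reduces to pure distance minimization, while large values of $\lambda$ and $\gamma$ make late or unserved customers dominate the cost, which is the trade-off described above.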
The model imposes the following operational constraints across all configurations:
Sets: [set definitions not recoverable from the source]
Parameters and decision variables: [definitions not recoverable from the source, except M, a large constant]
This constraint ensures full service coverage by requiring that every customer is visited exactly once and by a single vehicle. It prevents missing or duplicated visits, guaranteeing that the demand of each client is accounted for once in the route construction. In the context of our algorithm, only routes satisfying this condition are considered feasible candidates during the search.
Flow conservation enforces route continuity: whenever a vehicle arrives at a customer, it must also depart after service. This prevents dead ends or partial paths from forming within a route. When simulated annealing generates neighborhood solutions by swapping or inserting customers, this constraint ensures that the resulting configurations still represent valid traveling paths from the depot and back.
This constraint guarantees that each vehicle starts its route at the depot and returns there after completing deliveries. It models the operational requirement that all tours begin and end at a central facility, ensuring that no route is disconnected or open-ended. This condition defines the boundary of each route and is preserved during all neighborhood modifications.
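The three routing conditions described above (coverage, flow conservation and depot start/return) are commonly written in the following arc-based form. This is a hedged sketch rather than the paper's own equations: the binary variable $x_{ij}^{k}$, the customer set $C$, the vehicle set $K$ and the depot index $0$ are assumed notation.

```latex
\begin{align}
\sum_{k \in K} \sum_{j \in N,\, j \neq i} x_{ij}^{k} &= 1
  && \forall i \in C \\
\sum_{i \in N,\, i \neq h} x_{ih}^{k} - \sum_{j \in N,\, j \neq h} x_{hj}^{k} &= 0
  && \forall h \in C,\; k \in K \\
\sum_{j \in C} x_{0j}^{k} = \sum_{i \in C} x_{i0}^{k} &\le 1
  && \forall k \in K
\end{align}
```

If unserved customers are tolerated and penalized in the objective, the right-hand side of the first condition would become $1 - u_i$, with $u_i$ an assumed binary indicator for an unserved customer.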
The capacity constraint limits the total demand served by any vehicle so that it does not exceed its maximum loading capacity. It regulates the feasibility of load assignments and avoids overutilizing a vehicle. During simulated annealing optimization, any solution violating this constraint is rejected.
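A minimal sketch of this check is given below. It assumes a route is a list of customer identifiers and that demands are stored in a dictionary; the function names and demand values are hypothetical, while the default limit of 99 units is the effective capacity stated earlier.

```python
def route_load(route, demand):
    """Total demand carried on one route (a sequence of customer ids)."""
    return sum(demand[c] for c in route)


def respects_capacity(route, demand, capacity=99):
    """True if the total demand of the route fits within the vehicle capacity."""
    return route_load(route, demand) <= capacity


# Hypothetical demands, for illustration only.
demand = {1: 40, 2: 35, 3: 30}
print(respects_capacity([1, 2], demand))     # load 75 fits within 99
print(respects_capacity([1, 2, 3], demand))  # load 105 exceeds 99
```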
Subtour elimination removes circular paths that do not include the depot and would otherwise trap a subset of customers in isolated loops. This is managed through additional ordering or cumulative load variables that enforce logical route progression. This constraint ensures generated solutions represent complete depot-to-depot routes connecting all assigned customers without forming disjoint cycles.
Time windows restrict service to occurring within specified intervals for each customer. Service can begin only when the vehicle arrives within this window; early arrivals may wait, and late arrivals incur a penalty. In our algorithm, violation of time windows leads to an adjustment in objective cost through lateness penalties, steering the search toward temporally feasible routes.
This constraint tracks the chronological feasibility of each route by ensuring that the start of service at one customer logically follows the completion of service at the preceding stop plus travel time. It propagates actual arrival and departure times along the route.
Lateness measures how much a delivery exceeds the upper bound of the assigned time window. It is calculated as the positive deviation between actual arrival time and the latest allowed time. Lateness serves as a key feedback component in the objective function: solutions with higher lateness values receive higher total costs, penalizing them and driving the search toward schedules that better respect customers’ delivery constraints.
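The time propagation and lateness rules above can be illustrated with a short sketch. It assumes nested dictionaries for travel times and per-customer dictionaries for window bounds and service durations; all names and numbers are hypothetical.

```python
def schedule_route(route, travel_time, ready, due, service, depot=0, start=0.0):
    """Return (service start times, total lateness) for one route.

    A vehicle arriving before the window opens waits until `ready`;
    arriving after `due` adds the positive excess to the lateness.
    """
    t, prev = start, depot
    starts, lateness = [], 0.0
    for c in route:
        arrival = t + travel_time[prev][c]
        begin = max(arrival, ready[c])          # early arrivals wait
        lateness += max(0.0, arrival - due[c])  # positive deviation only
        starts.append(begin)
        t = begin + service[c]                  # departure after service
        prev = c
    return starts, lateness


# Hypothetical data in minutes.
travel_time = {0: {1: 10}, 1: {2: 15}}
ready = {1: 20, 2: 30}
due = {1: 40, 2: 35}
service = {1: 5, 2: 5}
starts, lateness = schedule_route([1, 2], travel_time, ready, due, service)
print(starts, lateness)
```

In this example the vehicle waits at customer 1 (arrival at 10, window opens at 20) and reaches customer 2 at minute 40, five minutes after its window closes.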
This constraint ensures that the energy level of each electric vehicle never exceeds its maximum battery capacity Qu at any stop. At the beginning of each route, the state of charge (SoC) is set to this upper limit, and as the vehicle travels along each segment, energy is consumed proportionally to the distance covered.
This constraint models the depletion of the electric vehicle’s battery as it travels between two locations. The remaining state of charge (SoC) at the next node equals the current SoC minus the energy consumed, which is proportional to the travel distance and the energy consumption rate. It ensures that the available charge never exceeds the vehicle’s maximum battery capacity or drops below operational limits during the route.
This constraint guarantees safety and reliability by requiring that each vehicle maintains a minimum battery reserve throughout its operation. It prevents vehicles from reaching critically low charge levels that could compromise route completion. In practice, this ensures the algorithm generates only feasible paths where the remaining energy always exceeds the minimum reserve.
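The battery conditions described in the last three paragraphs (upper bound, depletion along an arc, and minimum reserve) can be summarized in a hedged big-M form. The notation is assumed for illustration: $y_i^{k}$ is the SoC of vehicle $k$ on arrival at node $i$, $Q_u$ stands for the maximum battery capacity called Qu in the text, $q_{\min}$ is the minimum reserve, and $E_{ij}$ is the energy needed on arc $(i,j)$. Recharging is left out of this sketch.

```latex
\begin{align}
q_{\min} \;\le\; y_i^{k} &\le Q_u
  && \forall i \in N,\; k \in K \\
y_0^{k} &= Q_u
  && \forall k \in K \\
y_j^{k} &\le y_i^{k} - E_{ij}\, x_{ij}^{k} + M \,(1 - x_{ij}^{k})
  && \forall (i,j) \in A,\; k \in K
\end{align}
```

The last inequality is only binding when arc $(i,j)$ is used by vehicle $k$; otherwise the large constant $M$ deactivates it.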
This constraint defines how the state of charge changes when a vehicle recharges at a station, linking the battery level before and after charging. The increase in SoC depends on the station’s charging rate and the time spent charging, capped by the battery’s maximum capacity. Minimum and maximum charging durations are enforced to reflect real-world station policies. During optimization, charging time is dynamically adjusted to maintain route feasibility.
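The sketch below illustrates one possible reading of this charging rule, assuming a linear charging rate. The function names, units (kWh and minutes) and numerical values are hypothetical and are not taken from the study.

```python
def recharge(soc, rate, duration, capacity, t_min=0.0, t_max=float("inf")):
    """Return (SoC after charging, charging time actually used).

    The duration is clamped to [t_min, t_max] and the resulting SoC is
    capped at the battery capacity.
    """
    t = min(max(duration, t_min), t_max)
    return min(capacity, soc + rate * t), t


def time_to_reach(soc, target, rate, capacity, t_min=0.0, t_max=float("inf")):
    """State-dependent charging time needed to lift `soc` up to `target`."""
    needed = max(0.0, min(target, capacity) - soc) / rate
    return min(max(needed, t_min), t_max)


# Hypothetical values: 40 kWh battery, 0.5 kWh per minute, 5 to 60 minute stops.
t = time_to_reach(10.0, 30.0, 0.5, 40.0, t_min=5.0, t_max=60.0)
print(t, recharge(10.0, 0.5, t, 40.0, t_min=5.0, t_max=60.0))
```

Because the charging time is computed from the current SoC, a nearly full vehicle stays only for the minimum duration, whereas a depleted one stays longer, up to the maximum allowed.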
This constraint ensures that after servicing a node, vehicle k has enough remaining battery energy to travel from node i to the next node j while preserving the minimum required battery reserve. The required energy is represented by Eij. Therefore, the vehicle can only travel from i to j if its state of charge after leaving node i is at least equal to the energy needed for arc (i, j) plus the minimum reserve.
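As an illustration, the arc-level condition can be checked by walking along a route and comparing the remaining charge with the energy of the next arc plus the reserve. This is a simplified sketch without recharging stops; the data layout, names and values are assumptions.

```python
def battery_feasible(route, energy, capacity, reserve, depot=0):
    """Check depot -> customers -> depot without recharging.

    Before each arc (i, j) the SoC must be at least E_ij plus the reserve.
    """
    soc = capacity                      # vehicles leave the depot fully charged
    stops = [depot, *route, depot]
    for i, j in zip(stops, stops[1:]):
        if soc < energy[i][j] + reserve:
            return False
        soc -= energy[i][j]             # depletion along the arc
    return True


# Hypothetical arc energies in kWh.
energy = {0: {1: 4}, 1: {2: 3}, 2: {0: 5}}
print(battery_feasible([1, 2], energy, capacity=12, reserve=2))
print(battery_feasible([1, 2], energy, capacity=14, reserve=2))
```

With a 12 kWh battery the vehicle reaches customer 2 with 5 kWh, which is less than the 5 kWh return arc plus the 2 kWh reserve, so the route is rejected; with 14 kWh it passes.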
The energy consumption on each arc is modeled as a function of the traveled distance and the operating speed. Following simplified vehicle dynamics, the consumption per kilometer is decomposed into a constant component and a speed-dependent component. The constant term α represents the baseline energy required to overcome rolling resistance and auxiliary loads, while the term βvop² captures the additional energy demand associated with aerodynamic drag, which increases with the square of vehicle speed. Therefore, the total energy required to travel from node i to node j is expressed as Eij = (α + βvop²) · dij, where dij denotes the travel distance on arc (i, j).
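As a numerical illustration of this consumption model, the sketch below multiplies the arc distance by the per-kilometer rate α + β·v_op². The coefficient values, units and function name are hypothetical and chosen only to show the relative size of the two terms.

```python
def arc_energy(distance_km, speed_kmh, alpha, beta):
    """Energy on one arc: (alpha + beta * v_op**2) * distance."""
    per_km = alpha + beta * speed_kmh ** 2   # constant part + drag-related part
    return per_km * distance_km


# Hypothetical coefficients: alpha in kWh/km, beta in kWh/km per (km/h)^2.
e = arc_energy(distance_km=5.0, speed_kmh=30.0, alpha=0.10, beta=2e-5)
print(round(e, 3))  # about 0.59 kWh for this made-up setting
```

Since the operating speed is fixed in the planning model, the per-kilometer rate is a constant and the arc energy scales linearly with distance.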
The time-window structures shown in Section 5 are derived from Solomon-type benchmark cases found in the PyVRP SDVRPTW repository [31], a standardized reference dataset that offers time-window limits and service durations.
Our algorithm consists of several steps, starting with constructing the initial solution and inserting each client c.
Solving Approach
The simulated annealing procedure starts from a structured initial solution generated by a greedy insertion heuristic. Customers are first ranked according to a composite priority criterion that accounts for time-window urgency, demand level, and distance from the depot. The algorithm then attempts to insert each customer into the earliest feasible route position. Each tentative insertion is evaluated using the feasibility-checking procedure which verifies vehicle capacity, chronological consistency, time-window feasibility, service duration, battery consumption, and minimum state-of-charge requirements.
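The construction step can be sketched as follows. This is a simplified illustration, not the authors' implementation: the priority weights, the scoring formula and the use of a capacity-only feasibility callback are assumptions, whereas the real procedure also checks time windows and battery constraints.

```python
def priority(c, due, demand, dist_depot, w=(1.0, 0.1, 0.1)):
    """Composite score (lower = inserted earlier): urgent, large, distant first."""
    return w[0] * due[c] - w[1] * demand[c] - w[2] * dist_depot[c]


def greedy_construct(customers, n_vehicles, is_feasible, score):
    """Insert customers in priority order at the first feasible position."""
    routes = [[] for _ in range(n_vehicles)]
    unassigned = []
    for c in sorted(customers, key=score):
        placed = False
        for route in routes:
            for pos in range(len(route) + 1):
                candidate = route[:pos] + [c] + route[pos:]
                if is_feasible(candidate):
                    route[:] = candidate
                    placed = True
                    break
            if placed:
                break
        if not placed:
            unassigned.append(c)   # left for the repair phase
    return routes, unassigned


# Hypothetical instance: feasibility reduced to the 99-unit capacity check.
demand = {1: 60, 2: 50, 3: 40, 4: 70}
due = {1: 30, 2: 10, 3: 20, 4: 40}
dist_depot = {1: 5, 2: 5, 3: 5, 4: 5}
routes, unassigned = greedy_construct(
    customers=[1, 2, 3, 4],
    n_vehicles=2,
    is_feasible=lambda r: sum(demand[c] for c in r) <= 99,
    score=lambda c: priority(c, due, demand, dist_depot),
)
print(routes, unassigned)
```

Customers that cannot be placed are returned separately, which mirrors the hand-over to a repair phase.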
If a customer cannot be inserted during the first greedy pass, it is temporarily marked as unassigned. A repair phase is then applied, which explores possible insertion positions across the available routes and selects the most suitable feasible placement. In addition, repair attempts to reduce tardiness by relocating late customers to positions that minimize the induced delay. This repair mechanism strengthens the initial solution by reducing the number of infeasible elements before the simulated annealing search begins.
Finally, a 2-opt local improvement procedure is applied to each route in order to reduce unnecessary detours and improve route compactness. The 2-opt operator reverses subsequences within a route and keeps the modification only when it improves the solution while preserving feasibility. Consequently, the proposed initial solution is not random; it is a strong constructive solution designed to provide the simulated annealing algorithm with a high-quality starting point.
Figure 2 illustrates the decision logic used to build the initial solution for the electric vehicle routing problem.
After constructing the initial solution, the proposed method applies a simulated annealing metaheuristic to improve the routing plan. The algorithm starts from the repaired and locally improved initial solution, evaluates its penalized objective value, and then iteratively generates neighboring solutions using route modification operators. At each iteration, the new solution is checked for operational feasibility, evaluated through the objective function, and either accepted or rejected according to the simulated annealing acceptance rule. This process enables the search to accept not only improving moves but also, with a controlled probability, some deteriorating moves, which helps the algorithm escape local optima. Algorithm 1 summarizes the overall procedure.
Algorithm 1: Overall metaheuristic algorithm
1: Initialization.
2: Neighborhood generation: At each iteration, a neighboring solution is generated by applying one of the neighborhood operators: moving a customer, swapping two customers, or reversing a route segment.
3: Feasibility verification: After each modification, the affected routes are checked with respect to vehicle capacity, time-window consistency, service duration, and state-of-charge constraints.
4: Cost evaluation: The new solution is evaluated using the penalized objective function, which includes total travel distance, lateness penalties, and unserved-customer penalties.
5: Acceptance criterion: If the new solution has a lower objective value, it is accepted. Otherwise, it may still be accepted with the simulated annealing acceptance probability, or else it is rejected. After each iteration, the current solution and the temperature are updated.
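The acceptance step can be illustrated with the standard Metropolis rule, in which a worse solution with cost increase Δ is accepted with probability exp(−Δ/T). The sketch below assumes this rule together with a geometric cooling schedule; the schedule, parameter values and function names are assumptions, since the exact criterion and calibration are not reproduced in this section.

```python
import math
import random


def accept(delta, temperature, rng=random):
    """Metropolis rule: accept improvements, else accept with exp(-delta / T)."""
    if delta <= 0:
        return True
    return rng.random() < math.exp(-delta / temperature)


def anneal(initial, cost, neighbor, t0=100.0, cooling=0.95, t_min=1e-3, seed=0):
    """Generic SA loop with geometric cooling (an assumed schedule)."""
    rng = random.Random(seed)
    current, best = initial, initial
    t = t0
    while t > t_min:
        cand = neighbor(current, rng)      # None models an infeasible neighbor
        if cand is not None and accept(cost(cand) - cost(current), t, rng):
            current = cand
            if cost(current) < cost(best):
                best = current
        t *= cooling                       # temperature update
    return best


# Toy usage on a one-dimensional problem, for illustration only.
cost = lambda x: (x - 3) ** 2
neighbor = lambda x, rng: x + rng.choice([-1, 1])
best = anneal(10, cost, neighbor)
print(best, cost(best))
```

At high temperature almost any move is accepted, while near the end of the schedule the loop behaves like a pure descent, which is the usual motivation for annealing.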
In this study, three main perturbation operators are used during the search: move, swap, and reverse. These operators modify the current solution by changing either the assignment of customers to vehicles or the order in which customers are visited. After each perturbation, the affected routes are checked to ensure that hard constraints, such as vehicle capacity and battery feasibility, remain satisfied. In addition, a 2-opt local improvement procedure is periodically applied to reduce unnecessary detours and improve the solution.
Figure 3 illustrates the core principle of the 2-opt neighborhood move, which serves as a powerful local improvement operator within route optimization algorithms. The 2-opt method works by selecting two positions on a given route and reversing the order of all clients between these positions. This simple operation can eliminate route crossings and significantly reduce total travel distance. Integrating such neighborhood moves into the metaheuristic search process enables the algorithm to escape local minima and progressively refine the routes towards increased efficiency.
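A compact sketch of the 2-opt move is shown below. It considers distance only and omits the feasibility checks applied in the study; the coordinates, function names and the improvement-until-stable loop are illustrative assumptions.

```python
import math


def route_length(route, dist, depot=0):
    """Length of depot -> route -> depot."""
    stops = [depot, *route, depot]
    return sum(dist[a][b] for a, b in zip(stops, stops[1:]))


def two_opt(route, dist, depot=0):
    """Reverse segments while any reversal shortens the route."""
    best = list(route)
    improved = True
    while improved:
        improved = False
        for i in range(len(best) - 1):
            for j in range(i + 1, len(best)):
                # Reverse the clients between positions i and j (inclusive).
                cand = best[:i] + best[i:j + 1][::-1] + best[j + 1:]
                if route_length(cand, dist, depot) < route_length(best, dist, depot) - 1e-9:
                    best, improved = cand, True
    return best


# Hypothetical unit square: depot 0 and three clients on the corners.
pts = {0: (0, 0), 1: (1, 0), 2: (1, 1), 3: (0, 1)}
dist = {a: {b: math.dist(pa, pb) for b, pb in pts.items()} for a, pa in pts.items()}
print(two_opt([2, 1, 3], dist))
```

The initial order 2, 1, 3 crosses the square along both diagonals; reversing the first two clients removes the crossing and shortens the tour from about 4.83 to 4.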
Figure 4 demonstrates the operation of the move operator within a single vehicle’s route. In this neighborhood move, one client is selected and relocated from its original position in the sequence to a different location further along the route. This is illustrated by moving node 7 from its initial placement between 6 and 5 to a new position after 9. The move operator provides the search algorithm with the flexibility to explore alternative visit orders, which can help resolve constraint violations, reduce total route cost, or better align with time-window requirements. Such targeted adjustments are essential for escaping local minima and refining route quality during metaheuristic optimization.
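The move operator can be sketched in a few lines. The example assumes the route 2 → 6 → 7 → 5 → 9, which is consistent with the node positions mentioned above; the function name and index convention are hypothetical.

```python
def move(route, src, dst):
    """Remove the client at index `src` and reinsert it at index `dst`."""
    new = list(route)          # leave the original route untouched
    client = new.pop(src)
    new.insert(dst, client)
    return new


# Node 7 (index 2) is relocated to the end of the route, after node 9.
print(move([2, 6, 7, 5, 9], 2, 4))  # [2, 6, 5, 9, 7]
```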
Figure 5 illustrates the swap operator applied within the route of a single vehicle. In this neighborhood move, the positions of two clients are exchanged in the sequence, enabling the algorithm to explore different permutations without altering the rest of the route. For example, if the original sequence is 2 → 6 → 7 → 5 → 9, swapping 7 and 9 produces the sequence 2 → 6 → 9 → 5 → 7. This swap operation is useful for reorganizing service order, which may resolve violations (such as time-window overruns, load, or battery issues) or simply reduce route cost. The swap operator is a fundamental move in local search and metaheuristic optimization, providing a simple but effective means for diversifying and refining route solutions.
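The swap example from the text can be reproduced with a minimal sketch; the function name and the index-based interface are assumptions.

```python
def swap(route, i, j):
    """Exchange the clients at positions i and j, leaving the rest unchanged."""
    new = list(route)
    new[i], new[j] = new[j], new[i]
    return new


# Swapping clients 7 and 9 (indices 2 and 4) in 2 -> 6 -> 7 -> 5 -> 9.
print(swap([2, 6, 7, 5, 9], 2, 4))  # [2, 6, 9, 5, 7]
```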
5. SA Solution Performance and SUMO Route Evaluation
This section reports the results obtained for the Marrakesh case study using the calibrated settings of the SA algorithm and its integration with the SUMO microscopic simulation. The SA solution is first generated on the abstract EVRPTW model and then embedded into SUMO by replaying the planned routes on the Marrakesh road network. The comparison focuses on distance, trip time and energy usage in order to quantify the gap between high-level planning assumptions and network-level vehicle behavior.
All customers are served and no time windows are violated in the final solution (Figure 8), so the objective value (Figure 9) corresponds purely to the “real” routing cost without any lateness or unserved-customer penalties.
Figure 10 and Figure 11 show the per-vehicle performance of the fleet in the Marrakesh case, derived from the SA solution and from the SUMO microscopic simulation. The SA charts reflect the projected route distance and the associated energy demand for each vehicle, calculated using the equivalent consumption model.
Table 6 compares the route completion times obtained via SA with those derived from the SUMO simulation for each vehicle in the Marrakesh scenario. For the majority of vehicles, SUMO produces slightly longer tours, with individual deviations ranging from approximately −10% to +66% and an average increase of about 12% relative to the SA schedule.
Table 7 reports, for the same routes, the per-vehicle distance and energy consumption predicted by the SA model and measured in SUMO. In all cases the network distance driven in SUMO is substantially larger than the abstract distance used in the SA optimization, with deviations between about 66% and 280% and an average of 181%, reflecting the detours, one-way streets and indirect connections of the real Marrakesh road network. Consistently, the electrical energy drawn in SUMO is higher than the SA prediction for every vehicle, leading to an average energy underestimation of around 74% by the simplified equivalent-consumption model; this quantifies how optimistic a Euclidean, constant-intensity EVRP formulation can be when it is transferred to a microscopic urban traffic setting.
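The per-vehicle deviations discussed here can be computed as relative gaps between planned and simulated values. The sketch below assumes that the gap is expressed relative to the SA (planned) value and that the average is the mean of per-vehicle gaps; both conventions, as well as the numbers used, are assumptions for illustration and not the study's data.

```python
def relative_gap(planned, simulated):
    """Percentage by which the simulated value exceeds the planned one."""
    return 100.0 * (simulated - planned) / planned


def mean_gap(pairs):
    """Average of the per-vehicle relative gaps."""
    return sum(relative_gap(p, s) for p, s in pairs) / len(pairs)


# Hypothetical (planned, simulated) distances in km for three vehicles.
pairs = [(10.0, 28.0), (20.0, 40.0), (50.0, 45.0)]
print([relative_gap(p, s) for p, s in pairs])
print(mean_gap(pairs))
```

A negative gap, as for the third vehicle, means the simulation produced a smaller value than the plan, which is how a deviation such as −10% in completion time would arise.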
These discrepancies are explained by the different modeling choices in the two layers. In SA, inter-customer distances come from a Euclidean-type matrix, routes are evaluated at a fixed cruising speed, and energy is computed using an averaged aerodynamic formula that yields a constant equivalent consumption per kilometer. In SUMO [32], by contrast, routes follow the real street graph and energy is derived from detailed speed profiles that include acceleration, deceleration, stops and speed limits. As a result, SUMO generates longer paths, higher energy usage and slightly longer completion times, providing a more conservative and realistic assessment of the operational requirements of the electric fleet in Marrakesh.