Article

Open Competency Optimization with Combinatorial Operators for the Dynamic Green Traveling Salesman Problem

IABL, FSTT, Abdelmalek Essaadi University, Tetouan 93000, Morocco
* Author to whom correspondence should be addressed.
These authors contributed equally to this work.
Information 2025, 16(8), 675; https://doi.org/10.3390/info16080675
Submission received: 13 May 2025 / Revised: 1 August 2025 / Accepted: 4 August 2025 / Published: 7 August 2025
(This article belongs to the Section Artificial Intelligence)

Abstract

This paper proposes the Open Competency Optimization (OCO) approach, based on adaptive combinatorial operators, to solve the Dynamic Green Traveling Salesman Problem (DG-TSP), which extends the classical TSP by incorporating dynamic travel conditions, realistic road gradients, and energy consumption considerations. The objective is to minimize fuel consumption and emissions by reducing the total tour length under varying conditions. Unlike conventional metaheuristics based on real-coded representations, our method directly operates on combinatorial structures, ensuring efficient adaptation without costly transformations. Embedded within a dynamic metaheuristic framework, our operators continuously refine the routing decisions in response to environmental and demand changes. Experimental assessments conducted in practical contexts reveal that our algorithm attains a tour length of 21,059, which is indicative of a 36.16% reduction in fuel consumption relative to Ant Colony Optimization (ACO) (32,994), a 4.06% decrease when compared to Grey Wolf Optimizer (GWO) (21,949), a 2.95% reduction in relation to Particle Swarm Optimization (PSO) (21,701), and a 0.90% decline when juxtaposed with Genetic Algorithm (GA) (21,251). In terms of overall offline performance, our approach achieves the best score (21,290.9), significantly outperforming ACO (36,957.6), GWO (122,881.04), GA (59,296.5), and PSO (36,744.29), confirming both solution quality and stability over time. These findings underscore the resilience and scalability of the proposed approach for sustainable logistics, presenting a pragmatic resolution to enhance transportation operations within dynamic and ecologically sensitive environments.

1. Introduction

The TSP is one of the most well-known and widely studied problems in combinatorial optimization. It consists of finding the shortest possible tour that allows a salesman to visit each city in a given set exactly once and return to the starting point [1]. Due to its simplicity in formulation and complexity in solution, the TSP has become a benchmark problem in various domains such as logistics, transportation planning, and manufacturing. However, real-world routing problems are rarely static. In practical applications, parameters such as travel times and route constraints are often subject to change. This has led to the development of the dynamic TSP (DTSP), which considers real-time variations in the problem data, making the optimization process significantly more challenging. Among these dynamic extensions, particular attention has recently been given to problems that incorporate both dynamic inputs and environmental considerations. Motivated by growing concerns over climate change and sustainability, the research community has introduced the dynamic green TSP (DG-TSP) [2]. This formulation extends the DTSP by incorporating time-dependent travel costs influenced by road gradients and energy consumption models, and by aiming to minimize fuel usage and greenhouse gas emissions. The DG-TSP is of critical importance in the context of sustainable logistics, where businesses must balance operational efficiency with environmental responsibility. Despite its relevance, solving the DG-TSP remains a complex task. Traditional exact methods and basic heuristics are often insufficient to cope with the stochastic nature of travel conditions—such as fluctuating traffic patterns—and the nonlinearities arising from road gradients and energy constraints. While metaheuristics have demonstrated their capacity to address complex problems, their effectiveness in dynamic and green contexts largely depends on the adaptability of their operators and mechanisms. To address these challenges, we propose a novel metaheuristic framework based on dedicated combinatorial operators specifically tailored to the DG-TSP. These operators are designed to dynamically adapt to changing problem conditions, such as fluctuating demands and environmental constraints, allowing for real-time decision making. The approach is embedded within the OCO framework [3], which provides a flexible structure for integrating adaptive strategies and evolutionary search. Extensive simulations and real-world case studies validate the performance of our method. Compared to well-established metaheuristics such as GA, PSO, GWO, and ACO, our approach yields superior results in both tour length (as a proxy for fuel efficiency) and overall offline performance, confirming its robustness in sustainable routing scenarios.
The key contributions of this article are as follows:
  • Development of dedicated combinatorial operators: We design adaptive operators for permutation-based optimization, capable of reacting to stochastic changes in travel conditions and road gradients.
  • Integration into an adaptive metaheuristic framework: The proposed operators are embedded into the OCO metaheuristic, enabling efficient and scalable dynamic optimization.
  • Rigorous empirical validation: Through dynamic benchmarks, real-world case studies, and comprehensive statistical testing, we demonstrate the superiority of our method over GA, PSO, GWO, and ACO.
  • Formulation of the DG-TSP: We formalize a novel problem definition that integrates dynamic service conditions and environmental sustainability into the classical TSP model.
The remainder of this paper is structured as follows: Section 2 presents the formal definition of the DG-TSP. Section 3 explores the dynamic aspects of the problem and reviews the related literature. Section 4 introduces the metaheuristic framework. Section 5 details the design of our combinatorial operators. Section 6 discusses the experimental setup and results. Section 7 concludes with implications, future work, and a summary of key findings.

2. Green Logistics

Green logistics is a wide-reaching field that balances economic, environmental, and social concerns [4]. Its goal is to reduce harm to ecosystems and living beings while supporting healthier communities and sustainable economic growth. It emphasizes initiatives that produce minimal detrimental impacts on the environment and incorporates methodologies and practices that facilitate societal progress and economic enhancement. The integration of environmental considerations into logistical operations is crucial for promoting sustainable development within society. Consequently, the aims extend beyond merely the economic ramifications of logistical operations to include a wider array of environmental and societal consequences, which encompass the evaluation of environmental contamination, the management of waste, and the analysis of consumer satisfaction. Green logistics functions as a strategic paradigm for organizations to attain sustainability while guaranteeing the enduring profitability of logistical endeavors [5]. The execution of green logistics (GL) entails a broad spectrum of activities and processes [6]. Zhang et al. propose a GL-based framework structured into three key categories: the assessment of green performance, execution of green operations, and the formulation of green strategies [7].
Generally speaking, logistics can be defined as the process of delivering a specified quantity of a product in an optimal condition to a designated customer at a precise location within a given timeframe [8]. Globalization and the decentralization of production have significantly contributed to the enhancement and diversification of logistics services. Today, logistics plays a crucial role across various essential sectors of society. Alongside its rapid development, there has been a growing awareness of environmental concerns in relation to logistics. Recent debates on environmental policies, coupled with rising oil prices, have underscored the necessity of minimizing vehicle movements whenever possible. As a result, research on logistics optimization has emerged as a pivotal field. The primary objective of logistics optimization is to reduce costs, minimize the number of vehicles required, and optimize travel routes. This study aims to address the TSP, a fundamental challenge in green logistics and transportation planning. The TSP is a classic NP-Hard problem [9], as its computational complexity grows exponentially with the size of the problem. Given a set of n cities $(V_1, V_2, V_3, \ldots, V_n)$ and a distance matrix $D = [d_{i,j}]$, where $d_{i,j}$ represents the distance between cities $V_i$ and $V_j$, the objective is to determine a tour that allows the traveler to visit each of the n cities exactly once and return to the starting city. Naturally, the total distance traveled must be minimized. More formally, the problem involves finding a permutation π of the n cities that minimizes the following sum of distances:
$\sum_{i=1}^{n-1} d_{\pi(i), \pi(i+1)} + d_{\pi(n), \pi(1)}$ (1)
To remain consistent with the rest of the paper, we define the problem over n cities, indexed from 1 to n, as reflected in the permutation-based cost function (1). This formulation captures the essence of the Green TSP, where the goal is to minimize a function of fuel consumption by identifying the optimal possible tour that satisfies the constraints of visiting each city exactly once and returning to the origin. Therefore, Equation (1) becomes
$\sum_{i=1}^{n-1} \mathrm{Consumption}_{\pi(i), \pi(i+1)} + \mathrm{Consumption}_{\pi(n), \pi(1)}$ (2)
with Consumption representing a function that calculates the fuel consumption between two cities. The article by Moghdani et al. [10] provides a systematic and comprehensive literature review on the green vehicle routing problem. The paper categorizes solution methodologies into six main groups: metaheuristics, heuristics, software-based applications, exact methods, commercial exact solvers, and hybrid approaches. In the same context, Simulated Annealing [11] has been employed to minimize total CO2 emissions and the total traveled distance. Ubeda et al. [12] applied Tabu Search to address routing problems and pollution reduction. Variable Neighborhood Search has been adopted within a hyper-heuristic framework to investigate the routing problem of mixed-energy vehicle fleets [13,14]. S. Zhang et al. [15] noted that Ant Colony Optimization is among the most popular metaheuristics for green logistics. Another metaheuristic, Multi-start Local Search, based on a multigraph reformulation, consists of two phases: the first employs fast constructive operators, while the second improves the solutions obtained in the first phase to address the green fleet problem [16]. The Non-Dominated Sorting Genetic Algorithm II (NSGA-II), a multi-objective genetic algorithm, was used for a Green Vehicle Routing Problem (Green-VRP) involving a three-echelon supply chain with uncertain customer demand [17]. The Firefly Algorithm (FA) was utilized to generate a set of feasible solutions for an asymmetric Green VRP with time windows, variable delivery deadlines, and heterogeneous vehicle dimensions [18].
The problem formulation can be stated as follows:
  • Mathematical Formulation of the Green Traveling Salesman Problem (Green TSP):
    For completeness and consistency with the classical mathematical literature on the TSP, we also provide the standard integer programming formulation.
    Let $G = (V, E)$ be a complete graph, where
    - $V = \{0, 1, \ldots, n-1\}$ denotes the set of cities (with city 0 as the start/return city);
    - E denotes the set of arcs between the cities.
    The parameters and decision variables are defined as follows:
    - $c_{ij}$: Environmental cost (e.g., CO2 consumption) associated with traveling from city i to city j;
    - $x_{ij} \in \{0, 1\}$: Binary variable equal to 1 if the tour travels from i to j, and 0 otherwise;
    - $u_i$: Auxiliary variable used to eliminate subtours (Miller–Tucker–Zemlin formulation).
  • Objective function:
    $\min \sum_{i=0}^{n-1} \sum_{j=0,\, j \neq i}^{n-1} c_{ij}\, x_{ij}$ (3)
  • Constraints:
    $\sum_{j=0,\, j \neq i}^{n-1} x_{ij} = 1 \quad \forall i = 0, 1, \ldots, n-1$ (4)
    $\sum_{i=0,\, i \neq j}^{n-1} x_{ij} = 1 \quad \forall j = 0, 1, \ldots, n-1$ (5)
    $u_i - u_j + n\, x_{ij} \le n - 1 \quad \forall i \neq j,\; i, j \in \{1, \ldots, n-1\}$ (6)
    $1 \le u_i \le n - 1 \quad \forall i \in \{1, \ldots, n-1\}$ (7)
  • Variable domains:
    $x_{ij} \in \{0, 1\} \quad \forall i, j = 0, \ldots, n-1,\; i \neq j$ (8)
  • Principle of the MTZ Method:
    In the TSP, the objective is to find a tour that visits each city exactly once and returns to the starting point. However, some solutions that satisfy the degree constraints may contain subtours, i.e., smaller cycles not covering all cities. These are invalid for the TSP. The Miller–Tucker–Zemlin (MTZ) method is a classical technique for eliminating such subtours in integer linear programming formulations. An auxiliary continuous variable $u_i$ is introduced for each city $i \in \{1, \ldots, n-1\}$ (excluding city 0, the depot). This variable estimates the visit order of city i in the tour. Linear constraints are then imposed to enforce a logical progression in the tour, thereby preventing subtours (a small solver sketch of this formulation is given right after this list). Let
    - $x_{ij} \in \{0, 1\}$ be a binary variable indicating whether arc $(i, j)$ is used;
    - $u_i$ be the order of visiting city i.
    Constraints (6) and (7) are added, with the following implications:
    - If $x_{ij} = 1$, the constraint becomes $u_i - u_j < 0$, hence $u_i < u_j$: city j is visited after city i.
    - If $x_{ij} = 0$, the constraint reduces to $u_i - u_j \le n - 1$, which always holds given the bounds on $u_i$.
    This ensures a consistent visit order among cities and prevents the formation of closed cycles (subtours) among subsets of cities.
  • Illustration of a forbidden subtour:
    [Inline illustration: cities 1 → 2 → 3 → 1 forming a closed subtour that excludes city 0.]
    In this subtour, cities 1, 2, and 3 form a closed cycle without visiting all cities (e.g., city 0 is excluded). The MTZ constraints prevent such configurations, as they would imply $u_1 < u_2 < u_3 < u_1$, which is contradictory.
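To make the formulation above concrete, the following minimal sketch builds and solves the MTZ model for a tiny illustrative instance. It assumes Python with the open-source PuLP modeling package; the 4-city cost matrix is a placeholder chosen for illustration, not data from this study.

import pulp

# Illustrative 4-city instance: c[i][j] = environmental cost of arc (i, j).
c = [[0, 2, 9, 10],
     [1, 0, 6, 4],
     [15, 7, 0, 8],
     [6, 3, 12, 0]]
n = len(c)

prob = pulp.LpProblem("Green_TSP_MTZ", pulp.LpMinimize)
arcs = [(i, j) for i in range(n) for j in range(n) if i != j]
x = pulp.LpVariable.dicts("x", arcs, cat="Binary")
u = pulp.LpVariable.dicts("u", range(1, n), lowBound=1, upBound=n - 1)  # bounds (7)

# Objective (3): total environmental cost of the selected arcs.
prob += pulp.lpSum(c[i][j] * x[(i, j)] for (i, j) in arcs)

# Degree constraints (4) and (5): leave and enter every city exactly once.
for i in range(n):
    prob += pulp.lpSum(x[(i, j)] for j in range(n) if j != i) == 1
    prob += pulp.lpSum(x[(j, i)] for j in range(n) if j != i) == 1

# MTZ subtour elimination (6), with city 0 acting as the depot.
for i in range(1, n):
    for j in range(1, n):
        if i != j:
            prob += u[i] - u[j] + n * x[(i, j)] <= n - 1

prob.solve(pulp.PULP_CBC_CMD(msg=False))
tour_arcs = [(i, j) for (i, j) in arcs if pulp.value(x[(i, j)]) > 0.5]
print("Selected arcs:", tour_arcs, "cost:", pulp.value(prob.objective))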
In this work, we consider the environmental cost $c_{ij}$ associated with traveling from node i to node j, which is used as the objective to minimize. This cost may represent fuel consumption or CO2 emissions resulting from vehicle movement. To remain consistent with standard benchmark instances—where detailed information about vehicle load, speed, or road slope is not provided—we adopt a simplified, distance-proportional formulation:
$c_{ij} = \beta \cdot d_{ij}$ (9)
where $d_{ij}$ denotes the Euclidean distance between nodes i and j, and $\beta > 0$ is a conversion coefficient reflecting the average environmental impact per unit distance. This linear model is commonly used in the literature and facilitates fair comparison with existing works on dynamic routing problems. However, it is worth noting that more realistic models of environmental cost exist. For instance, Bektaş and Laporte [19] proposed a load-dependent cost function of the form:
$c_{ij} = \alpha \cdot d_{ij} + \beta \cdot d_{ij} \cdot w_{ij}$ (10)
where $\alpha$ is the base cost per unit distance (for an unloaded vehicle), $\beta$ is the marginal cost per unit of load and distance, and $w_{ij}$ is the average load carried along arc $(i, j)$. Such models provide higher fidelity to real-world emissions but require additional input data, which are typically unavailable in dynamic TSP benchmarks. For simplicity and reproducibility, we adopt the linear model in this study, while acknowledging that the framework can easily accommodate more detailed cost functions in future work.
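For illustration, both cost models can be sketched as follows; the coefficient values and coordinates are arbitrary placeholders used only to show the shape of the computation.

import math

def linear_cost(d_ij, beta=0.21):
    # Distance-proportional environmental cost: c_ij = beta * d_ij.
    return beta * d_ij

def load_dependent_cost(d_ij, w_ij, alpha=0.15, beta=0.05):
    # Load-dependent cost in the spirit of Bektas and Laporte: alpha*d_ij + beta*d_ij*w_ij.
    return alpha * d_ij + beta * d_ij * w_ij

def euclidean(p, q):
    # Euclidean distance d_ij between two city coordinates.
    return math.hypot(p[0] - q[0], p[1] - q[1])

d = euclidean((0.0, 0.0), (30.0, 40.0))                  # d_ij = 50.0
print(linear_cost(d), load_dependent_cost(d, w_ij=800.0))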
However, in the real case, this green voyage is not always static; instead, it is dynamic. The topology between cities can change, cities can be added or eliminated [20,21], or the distance between cities can change (e.g., a detour imposed by roadworks) [21]. Consequently, the journey must be re-optimized to account for the changes as they arise, and Equation (1) becomes
$f(\mathrm{Tour}(t)) = \sum_{i=1}^{n(t)-1} \mathrm{Consumption}_{\mathrm{Tour}(i), \mathrm{Tour}(i+1)}(t) + \mathrm{Consumption}_{\mathrm{Tour}(n), \mathrm{Tour}(1)}(t)$ (11)
It is important to clarify that the function $\mathrm{Consumption}_{i,j}(t)$ used in Equation (11) corresponds to the environmental cost incurred when traveling from city i to city j at time t. In this work, we adopt a simplified linear model commonly used in the literature, where the environmental cost is directly proportional to the Euclidean distance between cities, such that
$\mathrm{Consumption}_{i,j}(t) = c_{ij}(t) = \beta \cdot d_{ij}(t)$ (12)
Here, $d_{ij}(t)$ denotes the time-dependent distance between nodes i and j, and $\beta > 0$ is a fixed conversion coefficient representing the average environmental impact (e.g., fuel usage or CO2 emissions) per unit of distance. This formulation allows for dynamic changes in the distance matrix D(t) and ensures consistency with standard benchmark assumptions.
The concept of the DTSP was initially introduced by Psaraftis [22]. Since then, many variants of the DTSP have been implemented, where the set of cities [23] and/or the travel costs between cities [21] or the distance matrix D(t) change during the optimization process [24]. Youness et al. [25] introduced a benchmark generator for this dynamic traveling salesman problem with different modes:
  • Change in travel topology [26];
  • Change in distance between cities [20];
  • Variation in number of cities [26].
DG-TSP falls under the general category of real-time fleet management. Much of the current literature is characterized by algorithms that react to new demands only once they arise, while neglecting the available stochastic information [20]. Overviews of these problems can be found in Powell. Others, to react to the changes in the problem due to its dynamic nature, have planned a renewal by the introduction of new solutions. This aspect is known in the literature as the introduction of immigrants [20,26,27]. We also find another line of research, known as anticipatory routing, in which probability distributions are used to predict the change in road traffic and react according to these probabilities [28].

3. Dynamic Green Logistics

The TSP becomes more realistic when it is subject to a dynamic and environmentally friendly setting. For instance, our well-known salesman aims to distribute goods to various cities, departing from their home city and returning after visiting all cities while following a less polluting route. The objective is to optimize CO2 emissions and plan the journey as efficiently as possible while preserving the environment and minimizing pollution. By considering the distances between cities, the salesman can generate the optimal itinerary and commence their tour. Navigating traffic delays while trying to minimize CO2 emissions is a tricky situation. When unexpected congestion complicates the plan, the driver must rapidly adapt their decision making to identify an alternative route that avoids traffic while minimizing emissions. This need for a quick re-evaluation adds complexity to the routing process.
  • Dynamic Green TSP (DG-TSP) with Traffic Factors
    In this paper, we generate Dynamic Green TSPs (DG-TSPs) by incorporating the traffic factor, which directly affects CO2 emissions. We assume that the cost of the link between cities i and j is given by $d_{ij} \times t_{ij}$, where $d_{ij}$ represents the normal travel distance and $t_{ij}$ denotes the traffic factor. Every f iterations of the algorithm, a random number $R \in [F_L, F_U]$ is generated to simulate potential traffic congestion, where $F_L$ and $F_U$ are the lower and upper bounds of the traffic factor $t_{ij}$, respectively [29]. Needless to say, traffic congestion significantly exacerbates air pollution.
    Each link between cities i and j has a probability m of being affected by traffic, wherein a distinct R value is generated to represent low, normal, or high levels of traffic congestion on different roads. Meanwhile, the remaining links are assigned $t_{ij} = 1$, indicating an absence of traffic.
    For instance, high-traffic roads are generated by assigning a higher probability to R values closer to $F_U$, whereas low-traffic roads are generated with a higher probability of assigning R values closer to $F_L$. This class of DG-TSP is referred to as random DG-TSP in this paper, since previously visited environments are not guaranteed to reappear [30].
  • Cyclic Dynamic Green TSP
    Another variation of the DG-TSP with traffic factors is the Cyclic DG-TSP, in which dynamic changes follow a cyclic pattern. In other words, previous environments are guaranteed to reoccur in the future. Such environments are more realistic than purely random ones, as they can, for instance, model the 24-h traffic dynamics of a typical day [31].
    A cyclic environment can be constructed by generating different dynamic scenarios with traffic factors as the base states, representing DG-TSP environments with low, normal, or high levels of traffic. The environment then transitions cyclically among these base states in a fixed logical sequence. Depending on the time of day, environments with varying traffic conditions can be generated. For example, during rush hours or on roads with steep slopes, there is a higher probability of generating R values closer to $F_U$, whereas during off-peak evening hours, a higher probability is assigned to R values closer to $F_L$.

3.1. Formulation of the DTSP

Formally, a dynamic optimization problem (DOP) can be defined as follows:
$\Omega = \left( X(t), \Lambda(t), f(t) \right)_{t \in T}$
where
  • Ω denotes the optimization problem;
  • X(t) is the search space at time t;
  • Λ(t) is the set of constraints at time t;
  • f(t) is the objective function at time t, which assigns an objective value to each solution $x \in X(t)$, with all components being time-dependent;
  • T is the set of time values.
For the static TSP, the objective function is to minimize the total distance of a tour:
$f(x) = \min \sum_{i=0}^{n-1} \sum_{j=0}^{n-1} d_{ij}\, x_{ij}$
subject to
$x_{ij} = \begin{cases} 1, & \text{if } (i, j) \text{ is included in the tour} \\ 0, & \text{otherwise} \end{cases}$
where n is the number of cities and $d_{ij}$ is the distance between city i and city j. In the context of the DTSP with traffic factors, the cost of traveling between cities i and j becomes $d_{ij} \times t_{ij}$. Although the explicit mathematical formulation for the DTSP with traffic factors is not rewritten as a new form of $f(x)$, the introduction of $t_{ij}$ implies that the distances $d_{ij}$ (or travel costs) are now time-dependent, i.e., $d_{ij}(t)$, reflecting changes due to traffic. The goal then becomes to track the optimum as these values $d_{ij}(t)$ evolve over time. In summary, the DTSP introduces temporal dynamics and uncertainty into routing problems, making them more complex and more relevant to real-world applications such as route planning under changing traffic conditions.
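A minimal sketch of how a tour is evaluated once traffic factors make the arc costs time-dependent; the matrices below are illustrative placeholders.

def dynamic_tour_cost(tour, d, t_factor):
    # Cost of a closed tour when each arc cost is d[i][j] * t_factor[i][j],
    # i.e., the time-dependent cost d_ij(t) currently in force.
    total = 0.0
    for k in range(len(tour)):
        i, j = tour[k], tour[(k + 1) % len(tour)]        # wrap around to the start city
        total += d[i][j] * t_factor[i][j]
    return total

d = [[0, 3, 4, 5], [3, 0, 6, 4], [4, 6, 0, 2], [5, 4, 2, 0]]
no_traffic = [[1.0] * 4 for _ in range(4)]
print(dynamic_tour_cost([0, 1, 2, 3], d, no_traffic))    # static cost: 3 + 6 + 2 + 5 = 16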
  • Definitions and Notations:
    Let a dynamic graph be defined as $G_t = (V_t, E_t)$ for each time step $t \in T = \{0, 1, \ldots, H\}$, where H is the time horizon.
    - $V_t$: Set of available cities at time t;
    - $E_t$: Set of valid arcs at time t;
    - $x_{ij}^t \in \{0, 1\}$: Equal to 1 if the route from city i to j is taken at time t;
    - $c_{ij}^t$: Ecological cost (e.g., CO2 emissions, energy) for the route $(i, j)$ at time t;
    - $d_{ij}^t$: Distance or travel time between i and j at time t;
    - $a_i^t$: Equal to 1 if city i is available at time t, 0 otherwise;
    - $u_i$: Position of city i in the tour (for subtour elimination);
    - $n = |V|$: Total number of cities to be visited (fixed).
  • Objective Function:
    Minimize the cumulative environmental cost (CO2 emissions) of the dynamic tour:
    $\min \sum_{t=0}^{H} \sum_{i \in V_t} \sum_{j \in V_t,\, j \neq i} c_{ij}^t \cdot x_{ij}^t$
  • Constraints:
    • Each city is visited exactly once over the horizon:
      $\sum_{t=0}^{H} \sum_{j \in V_t,\, j \neq i} x_{ij}^t = 1 \quad \forall i \in V$
      $\sum_{t=0}^{H} \sum_{j \in V_t,\, j \neq i} x_{ji}^t = 1 \quad \forall i \in V$
    • Temporal availability of cities and routes (cities and arcs may appear or disappear over time, as captured by the indicators $a_i^t$ and the time-dependent arc sets $E_t$);
    • Subtour elimination (MTZ formulation):
      $u_i - u_j + n \cdot \sum_{t=0}^{H} x_{ij}^t \le n - 1 \quad \forall i \neq j,\; i, j \in V \setminus \{0\}, \qquad 1 \le u_i \le n \quad \forall i \in V \setminus \{0\}$
    • Departure from and return to the initial city (city 0):
      $\sum_{t=0}^{H} \sum_{j \neq 0} x_{0j}^t = 1, \qquad \sum_{t=0}^{H} \sum_{i \neq 0} x_{i0}^t = 1$
    • Domain of the decision variables:
      $x_{ij}^t \in \{0, 1\} \quad \forall i, j \in V,\; \forall t \in T$
      $u_i \in \mathbb{Z} \quad \forall i \in V$
  • Remarks:
    - The graph may evolve over time: certain cities may become available or unavailable, and the costs $c_{ij}^t$ may vary.
    - The cost $c_{ij}^t$ incorporates CO2 emissions.
    - This model can be used in real-time or predictive environments (smart logistics, electric vehicles, etc.).

3.2. Integrating Real-World Constraints into DG-TSP for Green Logistics

The DG-TSP is an emerging research area aimed at enhancing the sustainability of logistics by accounting for real-world complexities such as dynamic travel conditions, road topography, and fuel consumption optimization. Here is a breakdown of the key areas within the DG-TSP:
  • Time-dependent service requests: These add a layer of complexity because the problem needs to adapt in real time. Research has explored ways to handle this uncertainty, using methods like scenario-based planning and reinforcement learning [32]. The focus is on creating algorithms that can efficiently manage these changing requests.
  • Slopes with different inclinations: Factoring in actual road conditions is essential for accurately estimating fuel consumption. Studies show that using detailed road gradient data can significantly improve fuel consumption and overall efficiency [33].
  • Fuel consumption optimization: This is a core aspect of making logistics sustainable. Studies demonstrate that incorporating fuel consumption models into routing algorithms can lead to major reductions in greenhouse gas emissions [34]. Smart technologies also play a role in optimizing routes to lower costs and environmental impact [34].
  • Sustainable logistics and smart technologies: Using smart technologies in logistics is becoming increasingly common for sustainability. This includes decision support systems and key performance indicators that rely on data to optimize operations. Digital twin technology can also make public transport systems more efficient through simulations and real-time insights.
In essence, the DG-TSP aims to create more flexible and environmentally friendly logistics by tackling real-world challenges and using advanced technologies.

4. Open Competency Optimization

A robust metaheuristic requires a careful balance between exploration and exploitation. An algorithm’s exploration capability refers to its ability to investigate diverse regions of the feasible solution space. Conversely, exploitation signifies the algorithm’s ability to guide individuals towards optimal solutions with utmost expediency. Excessive focus on exploration results in a purely stochastic search process, while overemphasis on exploitation yields a search approach prone to becoming trapped in a local optimum. The OCO algorithm [3] is meticulously crafted to achieve a harmonious balance between exploration and exploitation, as it is predicated on a contemporary framework of human learning. Competency-based learning empowers learners to formulate their own educational trajectories in accordance with their competencies while deriving inspiration from their peers. The objective is to steer learners towards the global optimum with maximal efficiency. This paradigm is applicable to the resolution of an open-ended problem within a cohort of learners, which may be restricted to a small classroom setting or extended to an online course wherein multiple students engage in synchronous interactive learning. A salient advantage of this inspiration-based methodology is that it cultivates a population of ideas aimed at addressing a specific problem, in contrast to a population of individuals, as observed in other algorithms modeled after insect or animal behaviors. In our algorithm, the population size is not static; rather, it is dynamic, as learners perpetually respond to and engage with the ideas proffered by their peers. Each learner possesses the capacity to generate one or multiple ideas predicated on their cognitive abilities and the feedback received from others regarding their proposals. The principal objectives of this approach are delineated as follows:
  • Each student formulates their unique educational trajectory predicated upon their specific skill sets;
  • Each student engaged in the learning process collaborates with the most immediate cohort, whether this is defined by geographical closeness or intellectual competencies. It is important to highlight that the size of this group is limited to a maximum of five participants;
  • Students have the opportunity to interact with each other via deliberations or by embracing advanced recommendations (i.e., optimal resolutions) from their colleagues.
Nevertheless, an algorithm may become ensnared in the constructs of optimal local spaces when addressing a multifaceted problem that encompasses numerous local optimal solutions. To mitigate premature convergence and to achieve a harmonious balance between exploration and exploitation capabilities, the proposed algorithm incorporates a variety of corrective measures. On one side, the average component or centroid facilitates the expansion of the search domain of the OCO algorithm. Conversely, learners are enabled to alter specific research concepts by modifying the interpretation of these concepts (adjusting one component of a vector while maintaining the others constant during self-learning updates). Algorithm 1 and Figure 1 provide a comprehensive overview of the proposed OCO algorithm.
Figure 1. Flowchart of OCO.
Algorithm 1 OCO algorithm: principal steps
1: Initialize N students (populations of solutions); t = 0;
2: while t < MaxIteration do
3:     Assess each student utilizing the fitness function;
4:     if capacity > ThresholdCapacity then
5:         Update students along self-learning;
6:     end if
7:     Update students along neighbor learner groups;
8:     Update students along leadership interactions;
9:     t = t + 1;
10: end while
11: Return the best solution(s);
Algorithm 1 represents the main optimization framework, while Algorithms 2–4 correspond to procedures that are called conditionally during execution, depending on the context.
As detailed in Section 4.1, Algorithm 2 implements a self-learning operator activated only in dynamic optimization scenarios where the environment or demand evolves over time. A control parameter, ThresholdCapacity, governs the decision to inject new solutions into the population, formalized in Equation (23). When this threshold is exceeded, Algorithm 2 handles the insertion of these candidate solutions to maintain population diversity and adaptability.
Following the decision to inject new solutions, a group of g candidate solutions must be generated. This group is constructed either by Algorithm 3 or Algorithm 4, depending on the diversification strategy employed:
  • Algorithm 3 generates solutions through a neighborhood-based approach leveraging local information for guided exploration.
  • Algorithm 4 constructs solutions randomly to enhance exploration capabilities.
Within the main optimization loop (Algorithm 1), the leadership interaction procedure is designed to enhance the quality of solutions through a guided learning strategy based on high-performing individuals. Specifically, each candidate solution is influenced by one or more leaders, selected based on their superior fitness values. The adjustment of each solution under this mechanism follows a structured update formula that is mathematically expressed in Equation (28). Equation (28) defines how the solution vector is modified by incorporating both exploitation (following the leader) and controlled stochastic perturbations to preserve diversity. This leadership-based learning balances convergence and exploration, allowing the algorithm to efficiently refine promising areas of the search space while avoiding premature stagnation. The implementation of this mechanism significantly contributes to the intensification phase of the algorithm and supports global optimization performance.

4.1. Self-Learning

Each individual engaged in the learning process has the capacity to enhance their understanding by integrating foundational concepts and reinterpreting them through the application of inquiry regarding the novel information presented by the specific problem context. The efficacy of the competency-oriented pedagogical framework resides in its potential to stimulate and motivate learners to generate novel knowledge by interrogating pre-existing understandings, modifying and adapting such knowledge to yield fresh perspectives and juxtaposing these with antecedent insights. The articulation of these cognitive engagements can be delineated as follows for a student X i :
$X_{i,r} = X_{i,r} + \mathrm{rand}_r \cdot (x_k - X_{i,r})$
where r is an index randomly chosen from 1 to D, D is the dimension of the learning vector (proposed problem solution), $\mathrm{rand}_r$ is a random number between 0 and 1, and $x_k$ is a value randomly selected from the domain of values for the element at index r. Nevertheless, certain individuals engaged in the learning process exhibit challenges in autonomously strategizing their educational trajectory. This constraint is contingent upon the individual competencies of each student, prompting us to establish a threshold of capacity. In this study, it should be noted that the Self-Learning operation, illustrated in Figure 1 and described in Algorithm 1, is only triggered under the first condition specified in Algorithm 2. This operator is designed to introduce diversity within the population, which is essential for maintaining adaptability in dynamic optimization scenarios. By selectively applying self-learning, the algorithm can explore new regions of the search space and better respond to environmental changes.
For a learner $X_i = (x_{i1}, x_{i2}, \ldots, x_{iD})$,
$x_{ir}^{new} = \begin{cases} x_{ir}^{old} + \mathrm{rand} \cdot (x_{ir}^{old} - x_{ik}), & \text{or} \\ x_{ir}^{old} + \mathrm{rand} \cdot (x_{il} + x_{ik}), & \text{or} \\ x_{ir}^{old} + \mathrm{rand} \cdot (\mathrm{Best}_m + x_{ik}), & \text{or} \\ x_{ik}, & \text{if } f(X_i^{new}) < f(X_i^{old}), \\ x_{ir}^{old}, & \text{otherwise} \end{cases}$ (23)
where r, k, l, and m represent indices randomly selected from the set of integers ranging from 1 to D, where D denotes the dimensionality of the learning vector (the proposed resolution of the problem), and rand signifies a stochastic variable constrained within the interval [0, 1]. Equation (23) proposes three different update strategies to improve the quality of the new solution $X_i^{new}$. More precisely, the algorithm evaluates the three proposed update rules for the candidate solution $X_i^{new}$; the strategy that yields the lowest objective function value is selected and assigned to $x_{ir}^{new}$. This selective mechanism ensures that each variable update contributes optimally to improving the overall solution quality. If none of these strategies leads to an improvement, i.e., if $f(X_i^{new}) \ge f(X_i^{old})$, then the component $x_{ir}^{new}$ is assigned a randomly selected value $x_{ik}$, where $k \in \{1, 2, \ldots, D\}$ and D is the problem dimension. This random assignment is conducted without applying any transformation, unlike the three enhancement formulas proposed in Equation (23). If this substitution does not result in a better solution, the algorithm keeps the original value $x_{ir}^{old}$.
The ThresholdCapacity parameter plays a critical role in determining whether a learner is eligible for the self-learning operation. A higher threshold (e.g., 0.8) restricts self-learning to individuals with strong internal capacities, thus favoring the exploration of promising solutions. In contrast, a lower threshold (e.g., 0.1) allows a wider range of individuals to engage in self-learning, thereby enhancing population diversity and exploration. This dynamic adjustment enables better adaptability to changes in the optimization landscape, particularly in dynamic problem settings. Conversely, the population size is characterized by its dynamism, attributed to the algorithm’s consideration of learners’ ideas rather than their numerical representation. Thus, each learner is permitted to propose one or more ideas, with the stipulation that this quantity shall not surpass five-fourths of the initial population size. Equation (23) delineates four strategies for the adaptation of solutions and the enhanced exploration of the search space; thus, the algorithm is capable of incorporating one or two solutions into the learners’ population, contingent upon their adaptation (Idea_Adaptation). Algorithm 2 encapsulates the requisite conditions.
Algorithm 2 Self-learning conditions
1: capacity = rand(0, 1)
2: IdeaAdaptation ← randomly chosen in [0, 1]
3: if capacity > ThresholdCapacity then
4:     Update along self-learning
5:     if NewPopulationSize < (5/4) × PopulationSize then
6:         if IdeaAdaptation < 0.05 then
7:             Insert one solution;
8:             NewPopulationSize ← NewPopulationSize + 1
9:         else
10:            if ThresholdCapacity = 0.1 then
11:                Insert two solutions;
12:                NewPopulationSize ← NewPopulationSize + 2
13:            end if
14:        end if
15:    end if
16: end if
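For illustration, a compact real-coded sketch of the self-learning step (Equation (23)) combined with the insertion conditions of Algorithm 2. The function and variable names are ours, and the population-size cap of Algorithm 2 is assumed to be enforced by the caller.

import random

def self_learning_step(x_old, best_m, fitness, threshold_capacity):
    # Returns the (possibly improved) learner and the number of new ideas to insert.
    capacity = random.random()
    if capacity <= threshold_capacity:                      # Algorithm 2, line 3: not eligible
        return x_old, 0
    D = len(x_old)
    r, k, l = (random.randrange(D) for _ in range(3))
    rnd = random.random()
    candidates = []
    for value in (x_old[r] + rnd * (x_old[r] - x_old[k]),   # three update strategies of Eq. (23)
                  x_old[r] + rnd * (x_old[l] + x_old[k]),
                  x_old[r] + rnd * (best_m + x_old[k]),
                  x_old[k]):                                # plain random reassignment
        x_new = list(x_old)
        x_new[r] = value
        candidates.append(x_new)
    best = min(candidates, key=fitness)
    x_next = best if fitness(best) < fitness(x_old) else x_old

    # Insertion conditions of Algorithm 2 (how many new ideas this learner contributes).
    if random.random() < 0.05:
        inserted = 1
    elif threshold_capacity == 0.1:
        inserted = 2
    else:
        inserted = 0
    return x_next, inserted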

4.2. Neighbor Learner Groups

The student, in their quest for knowledge acquisition, is influenced by the social network that encompasses them. This influence is dependent on their relative standing within the network as well as their analogous competencies. The membership of this network is confined to a maximum of five participants. Moreover, the configuration of the student roundtable within a network is systematically modified following a predetermined number of generations (the attraction period within a network) to ensure that the dissemination of information encompasses all learners, thereby enhancing exploratory capabilities and facilitating the examination of other promising domains within the search space. This learning roundtable adheres to the following formula:
$RT = |\alpha \cdot X_g - X|$ (24)
$X_{new} = X_g - \beta \cdot RT$ (25)
where g denotes the number of students in the group and is randomly selected from the set {2, …, 5}, while α and β (Figure 2) are two coefficients of learning attraction computed in the manner delineated below:
$\beta = r_1 \times 2.5\, \exp\!\left( -\frac{t^2}{2\,(\mathrm{LIter}/3)^2} \right) \times 2\pi$ (26)
$\alpha = r_2 - \frac{t}{\mathrm{LIter}}$ (27)
where $r_1$ and $r_2$ are arbitrarily chosen within the interval [0, 1]. The coefficients α and β decrease across successive generations, with β following a Gaussian-shaped profile while remaining within the confines of the learning attraction group, whereas LIter denotes the upper limit of iterations.
The variables that have previously been identified enable the student to engage effectively within their educational roundtable.
Figure 2. Variations in α and β over generations.
Algorithm 3 Strategy for the student group within a limited neighborhood
1: g = rand(2, 5) /* random number in {2, …, 5} */
2: R ← 0
3: for j ← 1 to g do
4:     RT_j = |α_i · X_{i+j} − X_i|
5:     R = R + X_{i+j} − β_j · RT_j
6: end for
7: R = R / g
8: if f(R) < f(X_i) then
9:     X_i = R
10: end if
A student can augment their proficiency not solely by engaging with their immediate environment but also by seeking connections with fellow learners who exhibit analogous competencies. As a result, the algorithm optimally leverages the potential of the neighborhood. The pseudocode of Algorithm 3 was refined at the instruction level to promote educational interactions within a nearest-neighbor cohort for $X_i$ (comprising $X_{i+1}, X_{i+2}, \ldots, X_{i+g}$), wherein the group size is restricted to a maximum of five randomly selected learners with similar skills. Furthermore, an additional strategy, based on a randomly chosen learner group, was employed in this research, as delineated in Algorithm 4.
Algorithm 4 Strategy of the randomly chosen learner group
1: g = rand(1, 5) /* random number in {1, 2, …, 5} */
2: R ← 0
3: for j ← 1 to g do
4:     K = rand(1, LN) /* random number in {1, 2, …, LN} */    ▹ LN: number of learners
5:     RT_j = |α_i · X_K − X_i|
6:     R = R + X_{i+j} − β_j · RT_j
7: end for
8: R = R / g
9: if f(R) < f(X_i) then
10:     X_i = R
11: end if
Both algorithms indeed perform the insertion of g solutions into the population; however, their insertion strategies differ significantly. Algorithm 3 implements a neighborhood-based insertion approach, which aims to insert solutions by exploiting the local structure of the solution space. In contrast, Algorithm 4 generates the g solutions randomly and inserts them without considering neighborhood information.
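As an illustration of the two strategies, the following real-coded sketch implements the group attraction of Algorithms 3 and 4 with NumPy; the acceptance test (keep R only if it improves the fitness) is left to the caller, and all names are ours.

import numpy as np

def group_attraction(X, i, alpha, beta, rng, random_group=False):
    # One group-learning update for learner X[i]; rows of X are learners.
    # Neighbour variant (Algorithm 3) uses the next g learners; the random
    # variant (Algorithm 4) draws g partners anywhere in the population.
    n_learners = X.shape[0]
    g = rng.integers(1, 6) if random_group else rng.integers(2, 6)
    R = np.zeros_like(X[i])
    for j in range(1, g + 1):
        partner = X[rng.integers(0, n_learners)] if random_group else X[(i + j) % n_learners]
        RT = np.abs(alpha * partner - X[i])      # roundtable distance, in the style of Eq. (24)
        R += partner - beta * RT                 # attraction toward the partner, Eq. (25) style
    return R / g

rng = np.random.default_rng(0)
X = rng.random((10, 5))                          # 10 learners, 5-dimensional ideas
candidate = group_attraction(X, i=3, alpha=0.7, beta=0.4, rng=rng)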

4.3. Leadership Interaction

In the domain of competency-based education, an individual’s learning is shaped by the paradigms established by the highest-achieving peer and exhibits a favorable reaction to authoritative guidance. This authority figure is not exclusively the individual with the highest achievement but rather the learner who is most centrally located within the spectrum of group performance, effectively representing the overall average of the cohort. This reciprocal engagement cultivates the enhancement and evolution of each learner’s skill set. The repercussions of these interactions, both with the leading performer and the median of the group, reside in the examination of potentially fruitful avenues for exploration. A comprehensive elucidation of this collaborative group methodology is articulated as follows:
  • Each student assimilates or responds to the concept of optimality. Either the optimal solution is retained or alternative avenues for the optimal solution are exploited. The competency-based paradigm is inadequately encapsulated by the notion of the middle class. Within conventional educational frameworks, students are predominantly influenced by their peers who perform at an average level. Those who excel or struggle significantly are often excluded from the benefits of such pedagogical methodologies. The entirety of learning within this paradigm tends to prioritize certain means while disregarding alternative approaches. To mitigate this predicament, each student engages with the average in accordance with their individual capabilities, as they cultivate their own competencies. This dynamic enhances the exploratory essence of the algorithm, and consequently, facilitates the enrichment of diverse ideas (solutions to the problem).
The equation representing this interaction within the framework of competency-based learning can be expressed for a student X i utilizing Formula (28).
$x_i^{new} = C_i \cdot x_i^{old} + \varepsilon_i \cdot (X_{best} - \lambda \cdot X_{mean})$ (28)
where λ is a learning coefficient that belongs to the set {1, 2}, $X_{mean}$ is the average value of all learners, $X_{best}$ is the value of the best learner, $C_i$ is the learner's capability for $X_i$ in Formula (29), and $r_1$ is a random number ($r_1 \in [0, 1]$).
$C_i = \dfrac{1}{r_1 + \exp\!\left( \frac{f_i}{\max(f_i)\,\mathrm{Iter}} \right)}$ (29)
where $\epsilon_i$ is a modulation coefficient belonging to [0, 1], as shown in Formula (30).
$\epsilon_i = \dfrac{1}{r_1 + \exp\!\left( \frac{f_i}{\max(f_i) \times \mathrm{Iter}} \right)}$ (30)
where $f_i$ is the competency of the learner $X_i$ (the fitness function) and $\max(f_i)$ is the best capability (best value of the fitness function) over the learning cycle. The parameters $\epsilon_i$ and $C_i$ make it possible to control the diversity and improve the speed of convergence.
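A real-coded sketch of the leadership update; the forms of $C_i$ and $\epsilon_i$ follow the reconstruction of Equations (29) and (30) given above, and the capability is assumed to be a maximization score (for the DG-TSP, the negated tour cost can be used).

import numpy as np

def leadership_update(X, capability, r1, iteration, lam=1):
    # Eq. (28): x_new = C_i * x_old + eps_i * (X_best - lam * X_mean); iteration >= 1.
    X_best = X[np.argmax(capability)]            # leader: the most capable learner
    X_mean = X.mean(axis=0)                      # centroid of the cohort
    ratio = capability / (np.max(capability) * iteration)
    C = 1.0 / (r1 + np.exp(ratio))               # Eq. (29), as reconstructed here
    eps = 1.0 / (r1 + np.exp(ratio))             # Eq. (30), as reconstructed here
    return C[:, None] * X + eps[:, None] * (X_best - lam * X_mean)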

5. Operators for the TSP

This section introduces a novel toolkit of arithmetic operators tailored towards tackling combinatorial optimization problems, especially the TSP. Instead of using regular addition, multiplication, subtraction, and scalar multiplication, these operators are redefined to work with the discrete, permutation-based nature of TSP routes. The following sections will go into the specifics of these operators and how they are implemented.

5.1. Addition

When we “add” two TSP tours, written as $S_1 \oplus S_2$, we employ the cycle crossover (CX) operator. CX, a well-established genetic algorithm technique, generates offspring by preserving cycles from the parent tours while ensuring feasibility. In this context, CX is applied to $S_1$ and $S_2$ to produce a child tour. This ensures that the result is always a feasible, good-quality tour that inherits structure from both parents (Figure 3).
The following example highlights this addition operator in the case of the TSP:
  • S 1 = [1, 2, 3, 4, 5, 6, 7, 8]
  • S 2 = [4, 3, 2, 1, 6, 7, 8, 5]
  • $S_1 \oplus S_2$ = [1, 3, 2, 4, 5, 6, 7, 8]
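A minimal Python sketch of the addition operator that reproduces the example above (this is the alternating-cycle form of CX; city labels are assumed to be hashable).

def cycle_crossover(s1, s2):
    # "Add" two tours: copy the cities of alternating cycles from s1 and s2,
    # so every position keeps a value taken from one of the two parents.
    n = len(s1)
    child = [None] * n
    pos_in_s1 = {city: idx for idx, city in enumerate(s1)}
    take_from_s1 = True
    for start in range(n):
        if child[start] is not None:
            continue                             # position already covered by a cycle
        idx = start
        while child[idx] is None:                # follow the cycle until it closes
            child[idx] = s1[idx] if take_from_s1 else s2[idx]
            idx = pos_in_s1[s2[idx]]
        take_from_s1 = not take_from_s1          # alternate parents between cycles
    return child

print(cycle_crossover([1, 2, 3, 4, 5, 6, 7, 8],
                      [4, 3, 2, 1, 6, 7, 8, 5]))  # -> [1, 3, 2, 4, 5, 6, 7, 8]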

5.2. Multiplication

When multiplying two routes, represented as $S_1 \otimes S_2$, the Partially Mapped Crossover (PMX) operator [35] comes into play. PMX is useful because it keeps the positions of cities consistent between the “parent” routes while creating new, viable routes. The PMX crossover generates two candidate offspring routes. Among these, the route with the lowest cost—measured in terms of CO2 consumption—is selected as the outcome of the multiplication operator.
This approach takes advantage of PMX’s ability to maintain good sub-route structures while also introducing variety into the search for the best overall route (Figure 4).
The following example illustrates this multiplication operator in the context of the TSP; however, between the two resulting solutions, the best tour is selected:
  • S 1 = [1, 2, 3, 4, 5, 6, 7, 8]
  • S 2 = [5, 6, 7, 8, 1, 2, 3, 4]
  • $R_1 = S_1 \otimes S_2$ = [1, 6, 3, 4, 5, 2, 7, 8]
  • $R_2 = S_1 \otimes S_2$ = [5, 2, 7, 8, 1, 6, 3, 4]
  • $f(X) = \sum_{i=1}^{7} d_{\pi(i), \pi(i+1)} + d_{\pi(8), \pi(1)}$
We compute this sum for both $R_1$ and $R_2$. If $f(R_1) < f(R_2)$, then $R_1$ is selected; otherwise, $R_2$ is chosen.
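A sketch of the multiplication operator: PMX producing two offspring, followed by selection of the cheaper one. The cut points are normally drawn at random; the fixed cut shown in the usage line is only there to reproduce the example above.

import random

def pmx_pair(s1, s2, cut1=None, cut2=None):
    # Partially Mapped Crossover: returns the two offspring of s1 and s2.
    n = len(s1)
    if cut1 is None:
        cut1, cut2 = sorted(random.sample(range(n), 2))

    def pmx(donor, receiver):
        child = list(receiver)
        child[cut1:cut2] = donor[cut1:cut2]          # copy the mapping section
        mapping = {donor[i]: receiver[i] for i in range(cut1, cut2)}
        for i in list(range(cut1)) + list(range(cut2, n)):
            city = receiver[i]
            while city in mapping:                   # resolve duplicates through the map
                city = mapping[city]
            child[i] = city
        return child

    return pmx(s1, s2), pmx(s2, s1)

def multiply(s1, s2, cost):
    # S1 (x) S2: keep the offspring with the lower CO2 cost.
    return min(pmx_pair(s1, s2), key=cost)

r_a, r_b = pmx_pair([1, 2, 3, 4, 5, 6, 7, 8], [5, 6, 7, 8, 1, 2, 3, 4], cut1=1, cut2=2)
# r_a, r_b -> [5, 2, 7, 8, 1, 6, 3, 4] and [1, 6, 3, 4, 5, 2, 7, 8], as in the example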

5.3. Subtraction

The subtraction operation, denoted as $S_1 \ominus S_2$, is defined as the addition of $S_1$ and the negation (inverse) of $S_2$. Negation involves reversing the sequence of cities in $S_2$ to create an inverse tour. Once the negated version of $S_2$ is obtained, the previously defined addition operation ($S_1 \oplus$ negated $S_2$) is applied to compute the result. This definition extends the concept of subtraction to the combinatorial domain of TSP tours while ensuring consistency within the proposed framework.
For the subtraction operator, it is necessary to invert the second operand and then perform the addition.
  • S 1 = [1, 2, 3, 4, 5, 6, 7, 8]
  • $S_2$ = [5, 6, 7, 8, 1, 2, 3, 4]; reverse($S_2$) = [4, 3, 2, 1, 8, 7, 6, 5]
  • $S_1 \ominus S_2 = S_1 \oplus$ reverse($S_2$) = [1, 3, 2, 4, 5, 7, 6, 8]
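The subtraction operator then composes the two previous sketches (reversal followed by cycle crossover, reusing cycle_crossover from Section 5.1).

def tour_subtraction(s1, s2):
    # S1 (-) S2: add S1 to the reversed (negated) S2 via cycle crossover.
    return cycle_crossover(s1, list(reversed(s2)))

print(tour_subtraction([1, 2, 3, 4, 5, 6, 7, 8],
                       [5, 6, 7, 8, 1, 2, 3, 4]))   # -> [1, 3, 2, 4, 5, 7, 6, 8]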

5.4. Scalar Multiplication

Scalar multiplication, denoted as $k \cdot S_i$, where k is a scalar value, is implemented through a two-step process:
  • Initially, the scalar k undergoes an initial transformation via one of the nine predefined transformation functions proposed in this study, converting it into an integer value within the set $\{1, 2, \ldots, N\}$, where N denotes the population size. Following this conversion, the population is systematically ranked based on the fitness function. The swap mutation operator selects two random positions within the solution and exchanges their elements. This simple yet effective operation helps to maintain the permutation structure of the solution, which is crucial for routing problems. For example, if the solution is [1, 3, 2, 4], swapping positions 2 and 3 results in [1, 2, 3, 4].
    A swap mutation is then applied to the tour at index T(k), where T(k) denotes the index obtained from the transformation function within the ranked population. The outcome of this mutation is a tour, which we shall designate as $S_{sm}$. Subsequently, we apply the multiplication operation, as defined in this study, to combine it with $S_i$ ($S_i \otimes S_{sm}$).
  • Additionally, a heuristic mutation is applied. This mutation employs a problem-specific improvement strategy, such as 2-opt, to refine the tour further and enhance its quality.
This dual-stage approach ensures that scalar multiplication not only introduces variability but also integrates domain-specific heuristics to maintain solution quality. To illustrate the multiplication of a TSP tour (solution) by a scalar, the following example demonstrates this operation.
  • S 1 = [1, 2, 3, 4, 5, 6, 7, 8]
  • k = 0.25
The majority of functions employed in this study yield values approximating 25 (in this example). Consequently, the selected solution corresponds to the index equal to 25, specifically [2, 1, 3, 6, 5, 4, 7, 8].
After applying swap mutation to [2, 1, 3, 6, 5, 4, 7, 8], $S_{sm}$ = [2, 5, 3, 6, 1, 4, 7, 8]. $S_1 \otimes S_{sm}$ = [5, 2, 3, 6, 1, 4, 7, 8].
After this, we apply heuristic mutation: [2, 8, 7, 4, 1, 6, 3, 5].
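Putting the steps together, a hedged sketch of the scalar multiplication pipeline; it reuses pmx_pair from the previous sketch, and the 2-opt style refinement shown here is one concrete choice of heuristic mutation, not necessarily the exact variant used in the experiments.

import random

def swap_mutation(tour, rng=random):
    # Exchange two randomly chosen positions of the tour.
    a, b = rng.sample(range(len(tour)), 2)
    t = list(tour)
    t[a], t[b] = t[b], t[a]
    return t

def two_opt_once(tour, cost, rng=random):
    # Heuristic mutation: reverse one random segment and keep it if it is cheaper.
    a, b = sorted(rng.sample(range(len(tour)), 2))
    candidate = tour[:a] + tour[a:b + 1][::-1] + tour[b + 1:]
    return candidate if cost(candidate) < cost(tour) else tour

def scalar_multiplication(k, s_i, population, cost, transform, rng=random):
    # k * S_i: map k to an index via a transformation function, rank the population,
    # swap-mutate the selected tour, multiply (PMX) with S_i, then refine the result.
    ranked = sorted(population, key=cost)            # rank tours by CO2 cost
    index = min(transform(k), len(ranked)) - 1       # T(k) in {1, ..., N}
    s_sm = swap_mutation(ranked[index], rng)
    r1, r2 = pmx_pair(s_i, s_sm)                     # multiplication operator
    return two_opt_once(min((r1, r2), key=cost), cost, rng)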
Let $x \in [0, 1]$ be a uniform real number. Nine transformation functions are defined as mappings from the continuous interval [0, 1] to the discrete ordinal set $\{1, 2, 3, \ldots, N\}$, with N representing the population size.
  • Exponential transformation (parametric control):
    $T_2(x) = \lfloor 100\, x^k \rfloor + 1, \quad k > 0$
    For k > 1, smaller integers are favored (with an enhanced threshold effect).
  • Sigmoidal function (central concentration):
    $T_3(x) = \left\lfloor \dfrac{100}{1 + e^{-a(x - 0.5)}} \right\rfloor + 1, \quad a \gg 1$
    Parameter a controls the transition slope (typically $a \approx 10$).
  • Equidistant partition (indicator function):
    $T_4(x) = \sum_{i=1}^{100} i \cdot \mathbb{I}_{\left[\frac{i-1}{100}, \frac{i}{100}\right)}(x)$
    Exact implementation of a discrete uniform CDF.
  • Power law (controlled bias):
    $T_5(x) = \lceil 100\, x^n \rceil, \quad n \in \mathbb{N}$
    For n = 2, $P(y \le 50) \approx 75\%$.
  • Logarithmic transform (long tail):
    $T_6(x) = \min\left( \lfloor -100 \ln(1 - x) \rfloor + 1,\; 100 \right)$
    Requires special handling of x = 1 (to avoid ln(0)).
  • Trigonometric modulation (periodicity):
    $T_7(x) = \lfloor 50\, (|\sin(2\pi x)| + 1) \rfloor + 1$
    Produces a bimodal distribution (favors extreme values).
  • CDF inverse (distributional adaptation):
    $T_8(x) = \lceil T^{-1}(x) \rceil$, where T is a cumulative distribution function (CDF).
    Allows the imposition of any arbitrary discrete distribution.
  • Stochastic method (secondary randomization):
    $T_9(x) \sim U\{1, \ldots, \lceil 100\, x \rceil\}$
    Introduces additional variance controlled by x.
  • Linear–nonlinear mixture (compromise):
    $T_{10}(x) = \lfloor 50\, (x + x^2) \rfloor + 1$
    Interpolation between linear (x) and quadratic ($x^2$) behaviors.
  • For all functions $T_i$, we have $T_i(0) = 1$ and $T_i(1) = 100$, except:
    - $T_6$ (logarithmic transform) requires x < 1 to avoid the divergence of $\ln(1 - x)$ at x = 1;
    - $T_7$ (trigonometric modulation) reaches its maximum value (101) at x = 0.25 and x = 0.75, but not at x = 1;
    - $T_9$ (stochastic method) requires x > 0 to ensure the upper bound of the uniform distribution is at least 1.
  • All functions are piecewise continuous except for $T_4$ (equidistant partition), which is purely discrete.
  • Functions $T_4$ (uniform partition) and $T_8$ (CDF inverse) preserve important theoretical properties:
    - $T_4$ ensures fairness via uniform discretization;
    - $T_8$ is inherently invertible if the target CDF is strictly increasing.
  • Functions $T_2$ (exponential), $T_5$ (power law), and $T_{10}$ (linear–quadratic mixture) allow the explicit parametric control or structural modulation of the output distribution.
  • $T_9$ is the only non-deterministic transformation, introducing randomness via a uniform integer sampling based on x.
  • All functions require explicit boundary treatment to ensure well-defined behavior over the domain [0, 1].
In the experimental evaluation, the nine transformations are selected probabilistically at each iteration, thus promoting diversity across the search space (Figure 5).
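A sketch of a few of the transformation functions and of their probabilistic selection (N = 100, as in the formulas above; outputs are clamped to {1, …, N} for use as population indices, which slightly simplifies the boundary cases discussed earlier).

import math
import random

N = 100

def t2(x, k=2):       # exponential transformation
    return min(int(N * x ** k) + 1, N)

def t3(x, a=10):      # sigmoidal function
    return min(int(N / (1 + math.exp(-a * (x - 0.5)))) + 1, N)

def t4(x):            # equidistant partition
    return min(int(x * N) + 1, N)

def t7(x):            # trigonometric modulation
    return min(int(50 * (abs(math.sin(2 * math.pi * x)) + 1)) + 1, N)

TRANSFORMS = [t2, t3, t4, t7]

def pick_index(x, rng=random):
    # Select one transformation at random and map x in [0, 1] to an index.
    return rng.choice(TRANSFORMS)(x)

print(pick_index(0.25))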

5.5. Rationale and Advantages

To illustrate these operators, which allow update rules written for a real-valued encoding to be transposed to the permutation encoding of the TSP, we reconsider Equation (28), rewritten with the combinatorial operators:
$x_i^{new} = (C_i \cdot x_i^{old}) \oplus \left( \varepsilon_i \cdot \left( X_{best} \ominus (\lambda \cdot X_{mean}) \right) \right)$
The proposed arithmetic operators address the unique challenges of combinatorial optimization problems, particularly those encountered in TSP. By redefining traditional arithmetic operations in terms of crossover and mutation techniques, the methodology facilitates the manipulation of tours while respecting problem constraints. The key advantages of this approach include the following:
  • Feasibility preservation: Each operator ensures that the resulting tours remain valid permutations of cities, avoiding infeasible solutions;
  • Diversity promotion: The integration of crossover and mutation mechanisms enhances exploration of the search space;
  • Heuristic integration: The inclusion of heuristic mutation in scalar multiplication balances exploration and exploitation, thereby improving solution quality;
  • Scalability: The modular design of the operators allows for adaptation to other combinatorial problems beyond TSP.
Empirical evaluation of the proposed operators on benchmark TSP instances will validate their performance and demonstrate their potential applicability to related optimization problems.

6. Experimental Studies

Dynamic optimization can be defined as a sequence of multiple instances of a static problem that are interconnected through certain dynamic rules. The primary aspects of “dynamism” are the frequency and magnitude of environmental changes. Frequency refers to the rate at which changes occur, while magnitude corresponds to the degree of these environmental changes. A change may involve factors such as the objective function, input variables, or problem constraints, for example. Environmental changes are classified as either dimensional or non-dimensional. Both types of changes cause the optimum to shift in position, shape, and value. In a dynamic problem, it is not only necessary to detect the optimum but also to track it over time. However, with such problems, the population of solutions tends to lose diversity over generations.
In [3], we examined the diversity mechanism of OCO in comparison with the random-immigrant genetic algorithm (RI-GA) and the hypermutation genetic algorithm (HMGA). In this study, the concept of novel ideas emerging from the learning process (immigrant insertion: implicitly present) was applied to OCO.
To assess the impact of OCO’s remedial strategies on population diversity, we recorded population diversity at each generation for every OCO run on dynamic optimization problems (DOPs). The mean population diversity of OCO across DOP generations over 30 runs was computed using the following formula:
$\mathrm{Div}(t) = \frac{1}{30} \sum_{k=1}^{30} \left[ \frac{1}{n(n-1)} \sum_{i=1}^{n} \sum_{j \neq i}^{n} d_{i,j}(k, t) \right]$
where $d_{i,j}(k, t)$ denotes the Euclidean distance between the ith and jth individuals at generation t of the kth run, and n is the population size. Our findings revealed that OCO maintained the highest level of population diversity, whereas RI-GA and HMGA retained the lowest. The results of the aforementioned study demonstrated that OCO, through parameter control, sustained high diversity levels across diverse dynamic environments.
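For reference, a small sketch of how this diversity measure can be computed from the populations recorded at a given generation of each run (the normalization follows the formula as reconstructed above).

import numpy as np

def population_diversity(populations):
    # populations: array of shape (runs, n_individuals, dimension) for one generation t.
    runs, n, _ = populations.shape
    div = 0.0
    for P in populations:
        dists = np.linalg.norm(P[:, None, :] - P[None, :, :], axis=-1)
        div += dists.sum() / (n * (n - 1))       # average Euclidean distance over pairs i != j
    return div / runs                            # average over the repetitions (30 runs here)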

6.1. Comparison with Other Metaheuristics

In the first experiment, we evaluated OCO against PSO, ACO, GA, and GWO. However, the results obtained from these four comparative metaheuristics proved to be significantly suboptimal. The primary reason is that metaheuristics are naturally designed to converge towards the optimum: convergence here means that the population collapses onto a set of nearly identical solutions, so once the environment changes, the algorithm can no longer track the shifting optimum effectively.
Comparing the four metaheuristics designed exclusively for stationary optimization with our OCO metaheuristic, which is inherently crafted for both static and dynamic environments (incorporating the concept of new ideas emerging during the learning process), constitutes an inherently unfair comparison.
To address this disparity, we adopted a technique widely employed in dynamic optimization to track shifting optima: random immigrant insertion. This approach mitigates the limitations of static metaheuristics when applied to dynamic environments. The performance of our algorithm is tested on the DG-TSP. We generate the dynamic TSP by introducing a factor $t_{ij}$:
$t_{ij} = \begin{cases} 1 + R & \text{if } R \in [F_L, F_U) \text{ and } q \le m, \\ 1.0 & \text{otherwise,} \end{cases}$
where $F_L = 1$, $F_U = 5$, and q is a uniformly distributed random number in the interval [0, 1].
In another variant of the dynamic TSP, changes occur in a cyclical manner. This cyclical environment can be constructed by simulating light, normal, or heavy traffic, with these changes occurring across different iterations (representing periods of the day). Here, t i j = c y c l i c i j ( s t a t e ) :
\mathrm{cyclic}_{ij}(freq) = \begin{cases} 1 + R & \text{if } R \in [F_L, F_U) \text{ and } q \leq m, \\ 1.0 & \text{otherwise,} \end{cases}
where q is a uniformly distributed random number in the interval [0, 1] and freq represents the frequency of environmental change, from rapid (freq = 5) to slow (freq = 100).
In this study, m takes the values 0.1, 0.25, 0.5, and 0.75, indicating the degree of traffic intensity (light, normal, heavy, or very heavy), while freq ∈ {5, 10, 50, 100}. These changes are applied to routes derived from instances of the TSPLIB library [36], specifically kroA100, kroA150, and kroA200, which consist of 100, 150, and 200 cities, respectively, with Euclidean distance as the metric.
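A minimal sketch of the cyclic variant is given below, assuming the traffic states are pre-generated with the same rule as the random variant and that the environment switches to the next state every freq iterations; the helper names cyclic_states and costs_at are illustrative.

    import numpy as np

    def cyclic_states(n_cities, magnitudes, f_l=1.0, f_u=5.0, seed=0):
        """One traffic-factor matrix per state (e.g. light/normal/heavy/very heavy),
        each generated with the random-traffic rule and its own magnitude m."""
        rng = np.random.default_rng(seed)
        states = []
        for m in magnitudes:
            q = rng.random((n_cities, n_cities))
            r = rng.uniform(f_l, f_u, (n_cities, n_cities))
            t = np.where(q <= m, 1.0 + r, 1.0)
            np.fill_diagonal(t, 1.0)
            states.append(t)
        return states

    def costs_at(static_dist, states, iteration, freq):
        """Cycle through the pre-generated states, switching every `freq` iterations."""
        return static_dist * states[(iteration // freq) % len(states)]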
Our algorithm, incorporating new ideas (immigrant insertion), is compared with several state-of-the-art algorithms for dynamic optimization—namely ACO, PSO, GWO, and GA—all of which are enhanced with immigrant schemes specifically adapted to dynamic environments. Across the various instances, the performance is evaluated using the overall offline performance (OOP) metric defined by Jin and Branke [37]. This metric averages the quality of the best solution found by the algorithm after each dynamic change, over all iterations (Equation (44)).
P_{off} = \frac{1}{Iter} \sum_{i=1}^{Iter} \left( \frac{1}{Run} \sum_{j=1}^{Run} P_{ij} \right)
Equation (44) defines the offline performance metric P_{off}, which evaluates the average quality of the best solutions obtained over multiple runs and iterations. Specifically, Run independent executions are carried out under identical dynamic scenarios, each for a total of Iter iterations. At each iteration i of run j, the CO2 emission value of the best solution found is denoted by P_{ij}. The inner summation computes the average best performance at iteration i across all runs, while the outer summation averages these values over the entire optimization horizon. This metric reflects the algorithm’s ability to consistently identify high-quality solutions over time in a dynamic environment.
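For clarity, the metric can be computed directly from the recorded best-of-iteration values, as in the short Python sketch below (the array layout and the function name offline_performance are assumptions for illustration).

    import numpy as np

    def offline_performance(best_costs):
        """P_off for an array of shape (Run, Iter): best_costs[j, i] is the cost of
        the best solution found at iteration i of run j."""
        best_costs = np.asarray(best_costs, dtype=float)
        per_iteration = best_costs.mean(axis=0)   # inner average over the Run executions
        return per_iteration.mean()               # outer average over the Iter iterations

    # example with Run = 30 and Iter = 500 recorded best tour costs
    # p_off = offline_performance(np.random.uniform(20000, 40000, size=(30, 500)))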
For each algorithm-instance configuration, the offline performance (OOP) values reported in Tables 1–12 are computed as averages over 30 independent runs, each consisting of 500 iterations. This double averaging ensures a robust estimate of long-term algorithmic behavior and significantly reduces variance. Although explicit confidence intervals are not shown, this approach provides stable and representative performance metrics suitable for comparative analysis.
The comparatively higher OOP values observed for RI-GWO, RI-PSO, RI-ACO, and RI-GA may be attributed to specific algorithmic characteristics, such as limited population diversity, slower adaptation to environmental changes, or the absence of embedded local search mechanisms. These aspects can affect how effectively each algorithm maintains solution quality over time in dynamic environments. By contrast, the proposed method integrates adaptive learning components and strategic diversification mechanisms, which may explain its comparatively better OOP performance across multiple scenarios.
In this first experimental setup, we set the number of independent runs to R u n = 30 , and the number of iterations to I t e r = 500 for each run.
Tables 1–12 present the results of this comparison.
To better understand how the algorithms react to varying degrees of environmental change, we provide a detailed analysis of three representative instances: KroA100, KroA150, and KroA200. These instances were selected as they are typical of the small, medium, and large problem sizes in our benchmark set. Table 1 presents the results for the dynamic instance KroA100 under random traffic conditions with a change frequency freq = 5 and varying levels of change magnitude m ∈ {0.1, 0.25, 0.5, 0.75}. The analysis focuses primarily on the offline performance (OOP), which serves as a key indicator of the algorithm’s adaptability and robustness in dynamic environments.
As shown in the table, the proposed OCO algorithm consistently outperforms all comparative approaches (RI-GWO, RI-ACO, RI-PSO, RI-GA) in terms of OOP across all magnitudes. At magnitude = 0.1, OCO achieves an OOP of 21,290.9, which is significantly lower than that of RI-GWO (122,881.04), RI-ACO (36,957.6), RI-PSO (36,744.29), and RI-GA (59,296.5). This performance gap widens further as the change magnitude increases. In particular, when magnitude = 0.75, OCO maintains an OOP of 20,677, effectively preserving solution quality regardless of environmental severity. In contrast, all competing methods suffer from severe degradation; RI-GWO, for example, reaches an OOP of 125,660.76, while RI-PSO and RI-ACO exceed 36,000. These results highlight the superior reactivity and stability of OCO in dynamic scenarios, where the cost of poor adaptability is amplified over time. The ability of OCO to maintain low OOP values under different magnitudes suggests that it effectively integrates change detection and memory mechanisms, which allows it to track the moving optimum more efficiently than the random-immigrant metaheuristics (RI-GWO, RI-ACO, etc.). This confirms that OOP is a sensitive and appropriate metric for assessing the dynamic behavior of algorithms in the context of the green dynamic TSP. Moreover, OCO’s computational time remains within a reasonable range (approximately 22–25 s), offering a favorable trade-off between solution quality and runtime. By contrast, RI-GWO requires over 120 s with substantially poorer OOP results. In summary, the results in Table 1 clearly demonstrate that OCO exhibits a high degree of robustness and adaptability across varying levels of dynamism. Its superior OOP scores across all settings validate its effectiveness as a dynamic optimizer tailored for the Dynamic Green TSP.
To provide a comprehensive evaluation of the proposed OCO algorithm, the analysis was extended across multiple problem scales—namely KroA100, KroA150, and KroA200—under various settings of change frequency and change magnitude, thereby allowing for an in-depth assessment of its dynamic adaptability and performance stability.
The results in Tables 1–12 cover multiple dynamic scenarios with varying change frequencies (5, 10, 50, 100) and magnitudes (0.1 to 0.75).
To provide a deeper insight into these experimental results (Tables 1–12), we analyze the impacts of two key dynamic parameters—frequency and magnitude of change—on algorithmic performance, as well as the differences observed between RI-GA and our proposed OCO algorithm.
First, as the environment changes more frequently (i.e., for smaller values of freq, which denotes the number of iterations between changes), the dynamic environment becomes more volatile, requiring algorithms to adapt more often to maintain solution quality. This is reflected in a general decrease in performance across all methods as changes become more frequent, especially for algorithms without memory or adaptive components. OCO mitigates this effect through online adjustment and memory-based mechanisms that enable fast adaptation to frequent changes.
Second, higher change magnitudes introduce more disruptive shifts in the problem landscape. As expected, all algorithms experience a drop in quality when the magnitude increases. However, OCO maintains better stability and responsiveness compared to RI-GA, which suffers from delayed convergence due to its population-based structure.
Interestingly, while RI-GA achieves slightly shorter tour lengths in some instances (e.g., Length = 20,802 vs. 21,535 for OCO, as shown in Table 1), this does not translate into better overall dynamic behavior. The offline performance (OOP), computed with Iter = 500 and Run = 30, captures the algorithm’s robustness over time. According to this metric, OCO significantly outperforms RI-GA (e.g., OOP = 21,911.1 for OCO vs. 60,145.1 for RI-GA, as shown in Table 1), demonstrating its superior ability to track the shifting optimum in dynamic environments.
These observations confirm that evaluating algorithms solely on static quality measures (e.g., best tour length) may lead to misleading conclusions in dynamic contexts. Instead, metrics like OOP provide a more meaningful and time-sensitive assessment of adaptive behavior, where OCO shows clear advantages.
Key Observations:
  • Tour Length Performance:
    OCO consistently achieves superior performance with lower distances, maintaining stability even under high dynamics (e.g., magnitude = 0.75; frequency = 100; OCO = 26,786 vs. RI-GA = 64,870, as shown in Table 8).
  • Overall offline performance (OOP):
    For instance, at a change magnitude of 0.1, OCO achieves a tour length of 29,347 and an OOP of 29,654.7, significantly better than those of RI-GWO (OOP: 250,479.01), RI-ACO (OOP: 57,187.85), RI-PSO (OOP: 57,009.21), and RI-GA (OOP: 177,192.2) as shown in Table 12. This trend remains stable across increasing dynamic intensities, highlighting OCO’s robustness and adaptability to dynamic environments with both low and high degrees of change.
  • Computational efficiency (expressed in seconds):
    With respect to computational time, OCO incurs a higher computational cost than RI-ACO, RI-PSO, and RI-GA, yet remains substantially more efficient than RI-GWO. For example, at a change magnitude of 0.5, OCO’s execution time is approximately 333 s, whereas RI-GWO requires over 3000 s. While RI-ACO, RI-PSO, and RI-GA achieve faster runtimes (between 9 and 35 s), these are accompanied by significantly poorer solution quality in both tour length and OOP metrics.
    This trade-off suggests that OCO achieves a superior balance between dynamic performance and computational efficiency, making it particularly suitable for scenarios where solution accuracy and adaptability to environmental changes are critical, such as in green and sustainable logistics contexts.
Dynamic Performance Synthesis:
Resilience: The OCO algorithm consistently maintains high-quality solutions across all levels of environmental dynamism. For example, as the change magnitude increases from 0.1 to 0.75, the OOP values for OCO vary only slightly, from 27,762 to 27,890.9, and the tour length fluctuates marginally between 26,495 and 27,526. In contrast, RI-GWO’s OOP remains severely degraded over the whole range (between 190,501.7 and 187,354.89, as shown in Table 7), with only minor variation in solution quality (lengths between 28,414 and 27,430), suggesting poor adaptation to increasing change levels. Similarly, RI-GA’s OOP stays above 121,000 (122,030.8 at magnitude 0.1 and 121,889.3 at 0.75), while its tour lengths deteriorate significantly (from 54,251 to 60,577), showing that its solutions become less effective under dynamic conditions.
Quality–computation trade-off: While OCO’s computational time is higher than that of RI-ACO, RI-PSO, and RI-GA (e.g., OCO: 121–160 s vs. RI-ACO: 17–19 s, RI-PSO: 21–25 s, RI-GA: 12–38 s), this overhead is justified by substantially better performance in both tour length and OOP. For instance, at a change magnitude of 0.5, OCO yields a tour length of 26,495 and an OOP of 26,973.5, compared to RI-GA’s length of 51,720 and an OOP of 125,504.5, or RI-ACO’s length of 43,177 and an OOP of 49,190.58 (as shown in Table 7). Importantly, OCO also outperforms RI-GWO while being six to seven times faster (e.g., 113 s vs. 734–872 s) and producing significantly lower OOP values (OCO: 27,000 vs. RI-GWO: 187,000), affirming its computational efficiency relative to high-cost algorithms.
These findings validate that OCO achieves a robust balance between solution quality and computational time (for example, the average computational time of RI-GWO is approximately 6.6 times higher than that of OCO) making it especially well-suited for deployment in dynamic and environmentally constrained routing scenarios. The performance stability across varying change magnitudes strongly suggests that the algorithm maintains population diversity while avoiding premature convergence. This robustness can be attributed to OCO’s three-phase learning architecture, which enhances exploration through diversification and exploitation via localized refinement, thereby effectively adapting to environmental fluctuations in dynamic Green TSP instances.
Figure 6 compares the average Offline Performance (OOP) of five optimization algorithms, with a focus on CO2 emission minimization:
  • OCO achieves superior performance (OOP = 30,152), outperforming all baseline methods;
  • RI-GWO shows the weakest results (OOP = 254,934);
  • RI-ACO and RI-PSO demonstrate intermediate performance (OOP ≈ 57,000);
  • RI-GA performs better than RI-GWO but worse than OCO (OOP = 190,490).
The comparative analysis reveals substantial performance differentials among the evaluated algorithms:
  • Superiority of OCO: The algorithm demonstrates remarkable efficiency, achieving an 88% reduction in offline performance (OOP) compared to RI-GWO (30,152 vs. 254,934) and an 84% improvement over RI-GA. This significant margin suggests OCO’s enhanced capability to handle dynamic optimization constraints.
  • Intermediate Performers: RI-ACO and RI-PSO show comparable results, with OOP values (≈57,000) that OCO improves upon by approximately 47%. Their similar performance profiles indicate comparable limitations in adapting to environmental changes.
  • Relative underperformance: The substantial gap between OCO and RI-GWO (254,934 OOP) highlights fundamental limitations in the latter’s exploration–exploitation balance, particularly when addressing CO2 minimization in dynamic scenarios.
These findings underscore OCO’s algorithmic advantages in
  • Dynamic parameter adaptation;
  • Robustness against environmental variability;
  • Sustainable optimization capability.
The performance hierarchy (OCO ≫ RI-ACO ≈ RI-PSO > RI-GA ≫ RI-GWO) remains consistent across all tested change frequencies and magnitudes, suggesting structural rather than contextual superiority.
The significant performance gaps suggest
  • OCO’s adaptive learning mechanism effectively handles dynamic environmental changes;
  • Traditional metaheuristics (RI-GWO, RI-GA) struggle with frequency and magnitude changes;
  • Population-based methods (RI-ACO, RI-PSO) show moderate adaptability.

6.2. Statistical Tests

In the field of metaheuristic algorithm comparison, statistical validation is essential to assess whether observed performance differences are significant or merely due to stochastic variability. Given the non-deterministic nature of metaheuristics and the lack of normality in performance distributions, non-parametric statistical tests are generally preferred [38,39]. In this study, we adopt a rigorous statistical testing procedure including the Wilcoxon signed-rank test, the Friedman test, and the Nemenyi post hoc test, to compare the proposed OCO algorithm against four reference methods (RI-GWO, RI-ACO, RI-PSO, RI-GA) across several performance indicators.

6.2.1. Wilcoxon Signed-Rank Test

The Wilcoxon signed-rank test is a non-parametric alternative to the paired t-test, designed to compare the performance of two algorithms over multiple problem instances. It evaluates whether the median difference between paired values is significantly different from zero, without assuming the normal distribution of the data. The test operates under the following hypotheses:
  • H 0 : The two algorithms have equivalent performance distributions (median of differences is zero).
  • H 1 : The performance distributions differ significantly (median of differences is non-zero).
For each pair of algorithms, the differences in performance are computed, ranked by their absolute values, and summed separately for positive and negative ranks. The test statistic W is the minimum of these two sums:
W = \min\left( \sum_{i:\, d_i > 0} R_i, \; \sum_{j:\, d_j < 0} R_j \right)
where d_i is the difference between paired performances and R_i is the rank of |d_i|. A small p-value (p < α = 0.05) indicates that the performance difference is statistically significant.
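In practice, the test can be run with SciPy, as in the sketch below; the paired values are OOP results taken from Tables 1 and 2 and serve purely as an illustration of the call.

    from scipy.stats import wilcoxon

    # paired OOP values of OCO and RI-GA on the same instance configurations (Tables 1 and 2)
    oco   = [21290.9, 21911.1, 20775.9, 20677.0, 21755.4]
    ri_ga = [59296.5, 60145.1, 54423.2, 57858.2, 54110.1]

    stat, p_value = wilcoxon(oco, ri_ga)        # two-sided test on the paired differences
    print(f"W = {stat}, p = {p_value:.4f}")     # reject H0 at alpha = 0.05 when p < 0.05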
Results:
The Wilcoxon test was applied between OCO and each baseline method for three metrics: tour length, offline performance, and computation time. The results in Table 13 show statistically significant improvements of OCO over all baselines (p < 0.001 in all cases).
These results confirm that OCO significantly outperforms RI-GWO, RI-ACO, RI-PSO, and RI-GA in terms of all considered metrics.

6.2.2. Friedman Test

To simultaneously compare multiple algorithms across several problem instances, we employ the Friedman test, a non-parametric alternative to repeated-measures Analysis of Variance (ANOVA). It tests the null hypothesis that all algorithms perform equally on average:
H_0: All algorithms are statistically equivalent.
H_1: At least one algorithm differs significantly from the others.
The Friedman test statistic is calculated as
\chi_F^2 = \frac{12N}{k(k+1)} \left[ \sum_{j=1}^{k} R_j^2 - \frac{k(k+1)^2}{4} \right]
where N is the number of problems, k is the number of algorithms, and R_j is the average rank of algorithm j across all instances.
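SciPy provides this test directly; the sketch below applies it to the OOP values of the five algorithms from Table 1 (one list per algorithm, one entry per change magnitude), purely as an illustration of the call.

    from scipy.stats import friedmanchisquare

    # OOP per change magnitude (0.1, 0.25, 0.5, 0.75) for each algorithm, from Table 1
    oco    = [21290.9, 21911.1, 20775.9, 20677.0]
    ri_gwo = [122881.04, 123244.58, 126085.52, 125660.76]
    ri_aco = [36957.6, 36809.04, 36863.78, 36773.7]
    ri_pso = [36744.29, 37126.01, 36728.87, 37064.6]
    ri_ga  = [59296.5, 60145.1, 54423.2, 57858.2]

    chi2, p_value = friedmanchisquare(oco, ri_gwo, ri_aco, ri_pso, ri_ga)
    print(f"chi2_F = {chi2:.2f}, p = {p_value:.2e}")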
Results:
The Friedman test was applied for tour length, offline performance, and computation time. As reported in Table 14, the test shows highly significant differences among the algorithms (p ≤ 5.3 × 10⁻¹² for all metrics).

6.2.3. Nemenyi Post Hoc Test

Since the Friedman test indicates that not all algorithms are equivalent, we proceed with the Nemenyi test, which performs pairwise comparisons while controlling the family-wise error rate. It identifies which algorithms differ significantly in rank.
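A convenient way to obtain the pairwise Nemenyi p-values after a Friedman test is the third-party scikit-posthocs package, sketched below under the assumption that it is installed; the OOP matrix reuses the Table 1 values (rows = change magnitudes, columns = algorithms) purely for illustration.

    import numpy as np
    import scikit_posthocs as sp   # third-party package "scikit-posthocs"

    # rows = blocks (change magnitudes 0.1, 0.25, 0.5, 0.75), columns = algorithms
    # column order: OCO, RI-GWO, RI-ACO, RI-PSO, RI-GA (values from Table 1)
    oop = np.array([
        [21290.9, 122881.04, 36957.6,  36744.29, 59296.5],
        [21911.1, 123244.58, 36809.04, 37126.01, 60145.1],
        [20775.9, 126085.52, 36863.78, 36728.87, 54423.2],
        [20677.0, 125660.76, 36773.7,  37064.6,  57858.2],
    ])

    p_matrix = sp.posthoc_nemenyi_friedman(oop)   # DataFrame of pairwise Nemenyi p-values
    print(p_matrix.round(4))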
Results:
Table 15 displays the p-values of pairwise comparisons between OCO and the other algorithms for each metric.
The results show that OCO significantly outperforms RI-ACO and RI-GA in most cases. Differences with RI-PSO and RI-GWO are less pronounced and not always statistically significant.

7. Conclusions

This research introduces an innovative and effective methodology for addressing the Dynamic Green TSP, which incorporates time-varying travel conditions and realistic roadway features (e.g., inclines), with the objective of reducing energy consumption and greenhouse gas emissions. The proposed OCO algorithm integrates combinatorial operators within an adaptive metaheuristic framework, enabling robust, low-complexity solutions for dynamic environments. Empirical validation through extensive simulations and case studies confirms significant improvements in fuel efficiency, tour duration, and computational performance compared to state-of-the-art methods, particularly in highly dynamic contexts. Our methodological contribution lies in the direct integration of permutation-based combinatorial operators into existing real-valued metaheuristics, eliminating the need for complex encoding transformations and enabling broad applicability across dynamic optimization problems such as the DVRP and adaptive scheduling. Future research will focus on extending this framework to multi-agent systems and on integrating it within multi-objective planning paradigms, in order to better address the interplay between economic efficiency, environmental sustainability, and service quality.

Author Contributions

Conceptualization, K.J.; Resources, R.B.; Writing – review & editing, R.B. and M.T.; Supervision, K.J. All authors have read and agreed to the published version of the manuscript.

Funding

This research received no external funding.

Institutional Review Board Statement

Not applicable.

Informed Consent Statement

Not applicable.

Data Availability Statement

Data available in a publicly accessible repository.

Conflicts of Interest

The authors declare no conflicts of interest.

References

  1. Osaba, E.; Yang, X.S.; Del Ser, J. Traveling salesman problem: A perspective review of recent research and new results with bio-inspired metaheuristics. In Nature-Inspired Computation and Swarm Intelligence; Elsevier: Amsterdam, The Netherlands, 2020; pp. 135–164. [Google Scholar]
  2. Gao, W.; Luo, Z.; Shen, H. A branch-and-price-and-cut algorithm for time-dependent pollution routing problem. Transp. Res. Part C Emerg. Technol. 2023, 156, 104339. [Google Scholar] [CrossRef]
  3. Ben Jelloun, R.; Jebari, K.; El Moujahid, A. Open Competency Optimization: A Human-Inspired Optimizer for the Dynamic Vehicle-Routing Problem. Algorithms 2024, 17, 449. [Google Scholar] [CrossRef]
  4. Nicoletti, B.; Appolloni, A. Green Logistics 5.0: A review of sustainability-oriented innovation with foundation models in logistics. Eur. J. Innov. Manag. 2024, 27, 542–561. [Google Scholar] [CrossRef]
  5. Wu, Y.; Wang, S.; Zhen, L.; Laporte, G. Integrating operations research into green logistics: A review. Front. Eng. Manag. 2023, 10, 517–533. [Google Scholar] [CrossRef]
  6. Blanco, E.E.; Sheffi, Y. Green logistics. In Sustainable Supply Chains: A Research-Based Textbook on Operations and Strategy; Springer: Berlin/Heidelberg, Germany, 2024; pp. 101–141. [Google Scholar]
  7. Zhang, M.; Sun, M.; Bi, D.; Liu, T. Green logistics development decision-making: Factor identification and hierarchical framework construction. IEEE Access 2020, 8, 127897–127912. [Google Scholar] [CrossRef]
  8. Greco, F. Travelling Salesman Problem; In-teh: London, UK, 2002; pp. 75–115. [Google Scholar]
  9. Halim, A.H.; Ismail, I. Combinatorial optimization: Comparison of heuristic algorithms in travelling salesman problem. Arch. Comput. Methods Eng. 2019, 26, 367–380. [Google Scholar] [CrossRef]
  10. Moghdani, R.; Salimifard, K.; Demir, E.; Benyettou, A. The green vehicle routing problem: A systematic literature review. J. Clean. Prod. 2021, 279, 123691. [Google Scholar] [CrossRef]
  11. Küçükoğlu, İ.; Ene, S.; Aksoy, A.; Öztürk, N. A memory structure adapted simulated annealing algorithm for a green vehicle routing problem. Environ. Sci. Pollut. Res. 2015, 22, 3279–3297. [Google Scholar] [CrossRef]
  12. Úbeda, S.; Faulin, J.; Serrano, A.; Arcelus, F.J. Solving the green capacitated vehicle routing problem using a tabu search algorithm. Lect. Notes Manag. Sci. 2014, 6, 141–149. [Google Scholar]
  13. Zhang, C.; Zhao, Y.; Leng, L. A hyper-heuristic algorithm for time-dependent green location routing problem with time windows. IEEE Access 2020, 8, 83092–83104. [Google Scholar] [CrossRef]
  14. Rodríguez-Esparza, E.; Masegosa, A.D.; Oliva, D.; Onieva, E. A new hyper-heuristic based on adaptive simulated annealing and reinforcement learning for the capacitated electric vehicle routing problem. Expert Syst. Appl. 2024, 252, 124197. [Google Scholar] [CrossRef]
  15. Zhang, S.; Gajpal, Y.; Appadoo, S. A meta-heuristic for capacitated green vehicle routing problem. Ann. Oper. Res. 2018, 269, 753–771. [Google Scholar] [CrossRef]
  16. Andelmin, J.; Bartolini, E. A multi-start local search heuristic for the green vehicle routing problem based on a multigraph reformulation. Comput. Oper. Res. 2019, 109, 43–63. [Google Scholar] [CrossRef]
  17. Zhao, W.; Bian, X.; Mei, X. An Adaptive Multi-Objective Genetic Algorithm for Solving Heterogeneous Green City Vehicle Routing Problem. Appl. Sci. 2024, 14, 6594. [Google Scholar] [CrossRef]
  18. Micale, R.; Marannano, G.; Giallanza, A.; Miglietta, P.; Agnusdei, G.; La Scalia, G. Sustainable vehicle routing based on firefly algorithm and TOPSIS methodology. Sustain. Futur. 2019, 1, 100001. [Google Scholar] [CrossRef]
  19. Bektaş, T.; Laporte, G. The pollution-routing problem. Transp. Res. Part B Methodol. 2011, 45, 1232–1250. [Google Scholar] [CrossRef]
  20. Guntsch, M.; Middendorf, M.; Schmeck, H. An ant colony optimization approach to dynamic TSP. In Proceedings of the 3rd Annual Conference on Genetic and Evolutionary Computation, Málaga, Spain, 14–18 July 2025; Morgan Kaufmann Publishers Inc.: Burlington, MA, USA, 2001; pp. 860–867. [Google Scholar]
  21. Guntsch, M.; Middendorf, M. Pheromone modification strategies for ant algorithms applied to dynamic TSP. In Proceedings of the Applications of Evolutionary Computing, Como, Italy, 18–20 April 2001; Springer: Berlin/Heidelberg, Germany, 2001. [Google Scholar]
  22. Psaraftis, H.N. Dynamic vehicle routing: Status and prospects. Ann. Oper. Res. 1995, 61, 143–164. [Google Scholar] [CrossRef]
  23. Stodola, P.; Michenka, K.; Nohel, J.; Rybanskỳ, M. Hybrid algorithm based on ant colony optimization and simulated annealing applied to the dynamic traveling salesman problem. Entropy 2020, 22, 884. [Google Scholar] [CrossRef]
  24. Kang, L.; Zhou, A.; McKay, B.; Li, Y.; Kang, Z. Benchmarking algorithms for dynamic travelling salesman problems. In Proceedings of the 2004 Congress on Evolutionary Computation (IEEE Cat. No.04TH8753), Portland, OR, USA, 19–23 June 2004; Volume 2, pp. 1286–1292. [Google Scholar]
  25. Younes, A.; Basir, O.; Calamai, P. Adaptive control of genetic parameters for dynamic combinatorial problems. In Metaheuristics: Progress in Complex Systems Optimization; Springer: Berlin/Heidelberg, Germany, 2007; pp. 205–223. [Google Scholar]
  26. Mavrovouniotis, M.; Yang, S.; Yao, X. A benchmark generator for dynamic permutation-encoded problems. In Proceedings of the International Conference on Parallel Problem Solving from Nature, Taormina, Italy, 1–5 September 2012; Springer: Berlin/Heidelberg, Germany, 2012; pp. 508–517. [Google Scholar]
  27. Mavrovouniotis, M.; Yang, S. Ant algorithms with immigrants schemes for the dynamic vehicle routing problem. Inf. Sci. 2015, 294, 456–477. [Google Scholar] [CrossRef]
  28. Touzout, F.A.; Ladier, A.L.; Hadj-Hamou, K. An assign-and-route matheuristic for the time-dependent inventory routing problem. Eur. J. Oper. Res. 2022, 300, 1081–1097. [Google Scholar] [CrossRef]
  29. Asghari, M.; Mirzapour Al-e-hashem, S.M.J. Green vehicle routing problem: A state-of-the-art review. Int. J. Prod. Econ. 2021, 231, 107899. [Google Scholar] [CrossRef]
  30. Daşcıoğlu, B.G.; Yazgan, H.R. Dynamic green location and routing problem for service points. Int. J. Procure. Manag. 2020, 13, 112–133. [Google Scholar] [CrossRef]
  31. Gao, Z.; Xu, X.; Hu, Y.; Wang, H.; Zhou, C.; Zhang, H. Based on improved NSGA-II algorithm for solving time-dependent green vehicle routing problem of urban waste removal with the consideration of traffic congestion: A case study in China. Systems 2023, 11, 173. [Google Scholar] [CrossRef]
  32. Çimen, M.; Soysal, M. Time-dependent green vehicle routing problem with stochastic vehicle speeds: An approximate dynamic programming algorithm. Transp. Res. Part D Transp. Environ. 2017, 54, 82–98. [Google Scholar] [CrossRef]
  33. Dündar, H.; Soysal, M.; Ömürgönülşen, M.; Kanellopoulos, A. A green dynamic TSP with detailed road gradient dependent fuel consumption estimation. Comput. Ind. Eng. 2022, 168, 108024. [Google Scholar] [CrossRef]
  34. Lu, Y.; Yuan, Y.; Yasenjiang, J.; Sitahong, A.; Chao, Y.; Wang, Y. An Optimized Method for Solving the Green Permutation Flow Shop Scheduling Problem Using a Combination of Deep Reinforcement Learning and Improved Genetic Algorithm. Mathematics 2025, 13, 545. [Google Scholar] [CrossRef]
  35. Zhang, P.; Wang, J.; Tian, Z.; Sun, S.; Li, J.; Yang, J. A genetic algorithm with jumping gene and heuristic operators for traveling salesman problem. Appl. Soft Comput. 2022, 127, 109339. [Google Scholar] [CrossRef]
  36. TSPLIB95. 2025. Available online: http://comopt.ifi.uni-heidelberg.de/software/TSPLIB95/tsp/ (accessed on 10 April 2025).
  37. Jin, Y.; Branke, J. Evolutionary optimization in uncertain environments-a survey. IEEE Trans. Evol. Comput. 2005, 9, 303–317. [Google Scholar] [CrossRef]
  38. García, S.; Fernández, A.; Luengo, J.; Herrera, F. Advanced nonparametric tests for multiple comparisons in the design of experiments in computational intelligence and data mining: Experimental analysis of power. Inf. Sci. 2010, 180, 2044–2064. [Google Scholar] [CrossRef]
  39. Derrac, J.; García, S.; Molina, D.; Herrera, F. A practical tutorial on the use of nonparametric statistical tests as a methodology for comparing evolutionary and swarm intelligence algorithms. Swarm Evol. Comput. 2011, 1, 3–18. [Google Scholar] [CrossRef]
Figure 3. Visualization of the TSP tour before and after applying ⊕ combinatorial operators.
Figure 4. Visualization of the TSP tour before and after applying ⊗ combinatorial operators.
Figure 5. Transformation functions.
Figure 6. Overall offline performance.
Table 1. Instance KroA100 with random traffic and freq = 5.
Algorithms | Change_Magnitude | Length | OOP | Computational_Time (s)
OCO0.121,05921,290.922.3921022415161
RI-GWO0.121,949122,881.04127.165235280991
RI-ACO0.132,99436,957.68.11824822425842
RI-PSO0.121,70136,744.299.33303236961365
RI-GA0.121,25159,296.515.7245984077454
OCO0.2521,53521,911.123.079371213913
RI-GWO0.2523,035123,244.58128.834784030914
RI-ACO0.2531,92836,809.048.31470632553101
RI-PSO0.2523,22437,126.019.31120896339417
RI-GA0.2520,80260,145.117.3999660015106
OCO0.520,49220,775.925.1404733657837
RI-GWO0.521,298126,085.52132.616602420807
RI-ACO0.532,95136,863.788.51267838478088
RI-PSO0.523,11636,728.879.72644925117493
RI-GA0.521,79554,423.216.6300027370453
OCO0.7520,67720,67724.9908838272095
RI-GWO0.7521,709125,660.76121.951491594315
RI-ACO0.7532,76736,773.79.96449542045593
RI-PSO0.7521,59537,064.610.0018041133881
RI-GA0.7521,85357,858.212.9152412414551
Table 2. Instance KroA100 with random traffic and freq = 10.
Algorithms | Change_Magnitude | Length | OOP | Computational_Time (s)
OCO0.121,21321,755.420.9276142120361
RI-GWO0.122,415123,082141.25444483757
RI-ACO0.132,64236,868.928.4200701713562
RI-PSO0.123,36636,433.439.23282861709595
RI-GA0.122,26154,110.112.2368183135986
OCO0.2520,94121,319.119.2564599514008
RI-GWO0.2522,445121,673.9135.933153390884
RI-ACO0.2532,02537,099.718.23803329467773
RI-PSO0.2521,93836,932.479.66085481643677
RI-GA0.2521,94460,938.417.2285289764404
OCO0.521,11321,733.417.0954973697662
RI-GWO0.522,685124,461.43125.362582683563
RI-ACO0.532,46837,053.778.51956224441528
RI-PSO0.522,37536,750.329.71863651275635
RI-GA0.521,27265,36016.1961874961853
OCO0.7521,60021,698.620.8990180492401
RI-GWO0.7522,810122,989.42128.998858451843
RI-ACO0.7531,58737,074.318.57306122779846
RI-PSO0.7522,78737,073.639.89743041992188
RI-GA0.7521,33255,11011.2393550872803
Table 3. Instance KroA100 with random traffic and freq = 50.
Algorithms | Change_Magnitude | Length | OOP | Computational_Time (s)
OCO0.121,38121,719.813.9666087627411
RI-GWO0.122,508125,408.07131.110047101975
RI-ACO0.132,38437,043.38.42682528495789
RI-PSO0.122,70336,992.849.46939182281494
RI-GA0.122,29757,462.113.7826321125031
OCO0.2521,12321,356.221.1810195446014
RI-GWO0.2522,165124,025.48134.562022924423
RI-ACO0.2533,51536,917.278.55448985099793
RI-PSO0.2522,19737,195.9810.2855339050293
RI-GA0.2521,23759,372.911.4311144351959
OCO0.520,99621,833.324.4424633979797
RI-GWO0.521,806123,893.6127.856323480606
RI-ACO0.532,07136,921.348.39604187011719
RI-PSO0.523,58036,845.2110.1386339664459
RI-GA0.520,82660,281.113.3048303127289
OCO0.7521,49821,90025.3512442111969
RI-GWO0.7522,563125,067.34121.043132781982
RI-ACO0.7533,27437,134.158.59075498580933
RI-PSO0.7521,98036,792.849.66982936859131
RI-GA0.7521,07251,068.412.845165014267
Table 4. Instance KroA100 with random traffic and freq = 100.
Algorithms | Change_Magnitude | Length | OOP | Computational_Time (s)
OCO0.121,43221,640.424.6619594097138
RI-GWO0.121,847125,376.59122.259998083115
RI-ACO0.132,99937,034.768.71673130989075
RI-PSO0.122,89036,705.159.7340042591095
RI-GA0.121,20458,76315.9396677017212
OCO0.2520,93321,222.718.1225333213806
RI-GWO0.2522,386120,743.46132.594746589661
RI-ACO0.2532,15336,821.598.32460141181946
RI-PSO0.2522,93237,105.089.6085958480835
RI-GA0.2520,69456,726.313.8298809528351
OCO0.520,73821,019.821.4589228630066
RI-GWO0.522,282124,836.81130.818113327026
RI-ACO0.531,30436,751.178.78234338760376
RI-PSO0.521,86336,829.6810.0047512054443
RI-GA0.520,91761,604.214.5410196781158
OCO0.7520,82721,206.823.4733896255493
RI-GWO0.7521,874123,096.34128.855330705643
RI-ACO0.7532,79036,805.249.08334875106812
RI-PSO0.7522,17336,951.2410.3831281661987
RI-GA0.7521,74658,370.424.3120296001434
Table 5. Instance KroA150 with random traffic and freq = 5.
Algorithms | Change_Magnitude | Length | OOP | Computation_Time (s)
OCO0.127,51427,537.497.709445476532
RI-GWO0.127,691192,926.74808.249512910843
RI-ACO0.144,47148,827.6214.8391666412354
RI-PSO0.127,88348,902.1418.997878074646
RI-GA0.151,134115,868.4124.0980129241943
OCO0.2527,53928,145.8100.827348232269
RI-GWO0.2528,224198,925.63761.52139878273
RI-ACO0.2544,69149,477.8515.3931782245636
RI-PSO0.2527,48549,08220.4046738147736
RI-GA0.2563,970119,283.818.8867592811585
OCO0.526,38527,051107.441903352737
RI-GWO0.527,061183,743.64685.366256713867
RI-ACO0.543,17448,899.8915.3056991100311
RI-PSO0.529,01949,057.8318.8881287574768
RI-GA0.566,929125,583.821.7121329307556
OCO0.7527,07927,404.5123.916888237
RI-GWO0.7528,196193,147.6763.206788778305
RI-ACO0.7541,11249,126.1515.0603370666504
RI-PSO0.7527,18749,133.5320.5533874034882
RI-GA0.7564,776122,469.648.262743473053
Table 6. Instance KroA150 with random traffic and freq = 10.
Algorithms | Change_Magnitude | Length | OOP | Computation_Time (s)
OCO0.126,37926,659.997.8753378391266
RI-GWO0.127,858193,429.5714.04919552803
RI-ACO0.144,11848,824.1614.9826173782349
RI-PSO0.128,38648,637.320.6628074645996
RI-GA0.157,195117,65114.7551081180573
OCO0.2526,84227,165.5126.872204780579
RI-GWO0.2527,508189,454.851025.27630829811
RI-ACO0.2544,30148,997.6717.3445875644684
RI-PSO0.2527,79449,175.6222.2102901935577
RI-GA0.2558,335112,912.134.976592540741
OCO0.526,54226,662.7134.05682182312
RI-GWO0.527,838190,610.731075.06168675423
RI-ACO0.542,87348,877.3618.8859694004059
RI-PSO0.528,64349,179.1922.0664458274841
RI-GA0.579,180124,260.746.1346783638001
OCO0.7527,81827,818152.461178541184
RI-GWO0.7528,778190,314.64884.227702140808
RI-ACO0.7543,79749,108.4518.1741693019867
RI-PSO0.7528,55049,162.8122.0895233154297
RI-GA0.7564,176121,569.820.1959164142609
Table 7. Instance KroA150 with random traffic and freq = 50.
Algorithms | Change_Magnitude | Length | OOP | Computation_Time (s)
OCO0.127,52627,762121.465321063995
RI-GWO0.128,414190,501.7830.968395471573
RI-ACO0.144,37149,287.2817.7421343326569
RI-PSO0.128,01148,972.6921.2658190727234
RI-GA0.154,251122,030.838.3366434574127
OCO0.2526,84927,048.1160.889377355576
RI-GWO0.2527,152187,012.87852.514069080353
RI-ACO0.2542,29649,089.1617.8825562000275
RI-PSO0.2528,02849,131.9822.7740960121155
RI-GA0.2557,388120,709.223.9269785881043
OCO0.526,49526,973.5113.92518901825
RI-GWO0.527,340187,346.31734.989396572113
RI-ACO0.543,17749,190.5819.0754761695862
RI-PSO0.529,01749,260.8224.9999492168427
RI-GA0.551,720125,504.512.8652136325836
OCO0.7526,69127,890.9124.87078166008
RI-GWO0.7527,430187,354.89872.875902891159
RI-ACO0.7544,23649,154.9517.7652485370636
RI-PSO0.7528,56949,329.8122.5915353298187
RI-GA0.7560,577121,889.322.4923396110535
Table 8. Instance KroA150 with random traffic and freq=100.
Algorithms | Change_Magnitude | Length | OOP | Computation_Time (s)
OCO0.126,48326,967.1151.894057273865
RI-GWO0.127,697180,809.23901.101784229279
RI-ACO0.144,99449,155.8717.162840127945
RI-PSO0.128,52749,025.6621.9899196624756
RI-GA0.160,838120,72232.3962223529816
OCO0.2527,61827,912.8158.637745141983
RI-GWO0.2527,085186,486.95821.483963251114
RI-ACO0.2544,50249,334.3417.9162790775299
RI-PSO0.2528,05249,130.4522.5717957019806
RI-GA0.2568,551136,92833.0491642951965
OCO0.526,63426,822.1106.701067209244
RI-GWO0.527,528188,537.37842.777728319168
RI-ACO0.544,13049,125.6717.9808030128479
RI-PSO0.527,30749,025.2821.8809659481049
RI-GA0.558,353117,597.230.5044939517975
OCO0.7526,78626,896.5142.47221159935
RI-GWO0.7527,566189,727.8914.117618083954
RI-ACO0.7542,76849,019.3218.6860890388489
RI-PSO0.7528,92849,119.6521.5470142364502
RI-GA0.7564,870127,44227.0094890594482
Table 9. Instance KroA200 with random traffic and freq = 5.
Algorithms | Change_Magnitude | Length | OOP | Computation_Time (s)
OCO0.129,94430,206.1359.626891613007
RI-GWO0.130,594249,415.272679.29984951019
RI-ACO0.152,04057,023.4621.3514449596405
RI-PSO0.131,23357,408.6429.1911859512329
RI-GA0.1127,401188,598.420.3655452728272
OCO0.2530,34830,588297.62440609932
RI-GWO0.2530,245255,258.472495.94751191139
RI-ACO0.2551,16657,067.1721.998370885849
RI-PSO0.2531,53757,454.1631.5070748329163
RI-GA0.25120,371188,276.6613.6076040267944
OCO0.529,50629,820.4353.145201206207
RI-GWO0.530,662248,967.742924.53922724724
RI-ACO0.551,04356,992.9521.9690067768097
RI-PSO0.531,30957,256.1431.3970522880554
RI-GA0.5108,760190,927.58.91393232345581
OCO0.7529,97130,516.5410.367824316025
RI-GWO0.7529,946251,574.782558.51860165596
RI-ACO0.7552,45657,456.9122.2017307281494
RI-PSO0.7531,08857,208.2431.7390558719635
RI-GA0.75104,103179,700.810.56822681427
Table 10. Instance KroA200 with random traffic and freq = 10.
Algorithms | Change_Magnitude | Length | OOP | Computation_Time (s)
OCO0.129,71330,080319.550868988037
RI-GWO0.130,343259,596.232727.19788074493
RI-ACO0.152,37157,396.3722.1682703495026
RI-PSO0.131,33657,121.0533.6118106842041
RI-GA0.1147,503206,709.3140.0141541957855
OCO0.2529,32829,523362.919169902802
RI-GWO0.2530,347253,222.112659.1444709301
RI-ACO0.2552,31457,486.2621.7984554767609
RI-PSO0.2532,08257,334.8229.6725142002106
RI-GA0.25120,751189,406.38.541428565979
OCO0.530,68130,876305.512933254242
RI-GWO0.530,888254,072.172496.83245229721
RI-ACO0.553,07057,826.8822.5814046859741
RI-PSO0.531,82557,389.2530.6847245693207
RI-GA0.5115,006188,956.317.9437294006348
OCO0.7529,51629,894.3415.858951568604
RI-GWO0.7530,352255,688.222832.66897535324
RI-ACO0.7550,24357,443.9222.2953035831451
RI-PSO0.7531,06357,491.2130.4808707237244
RI-GA0.75131,803200,674.718.042248249054
Table 11. Instance KroA200 with random traffic and freq = 50.
Algorithms | Change_Magnitude | Length | OOP | Computation_Time (s)
OCO0.129,21029,729.5451.447427988052
RI-GWO0.130,854257,405.182592.22149825096
RI-ACO0.150,96457,112.5821.6153562068939
RI-PSO0.130,81157,450.2832.6675596237183
RI-GA0.1104,405184,189.119.2394804954529
OCO0.2529,04229,800.8415.193927288055
RI-GWO0.2531,585257,293.012587.61305117607
RI-ACO0.2552,74457,805.2122.3461935520172
RI-PSO0.2531,21657,549.130.3804461956024
RI-GA0.25118,791191,227.610.8311760425568
OCO0.530,15930,541.2338.287764072418
RI-GWO0.531,024258,235.323005.51641082764
RI-ACO0.551,17457,097.4223.0797967910767
RI-PSO0.530,73357,288.7632.4898002147675
RI-GA0.5139,844196,514.929.3715693950653
OCO0.7530,77331,107.1353.920367956162
RI-GWO0.7530,745256,092.132695.67001247406
RI-ACO0.7549,63857,178.9323.6201846599579
RI-PSO0.7531,85657,250.2131.6116788387299
RI-GA0.75103,571182,581.412.4816913604736
Table 12. Instance KroA200 with random traffic and freq = 100.
Algorithms | Change_Magnitude | Length | OOP | Computation_Time (s)
OCO0.129,34729,654.7459.092036247253
RI-GWO0.130,037250,479.013030.58809566498
RI-ACO0.150,68357,187.8523.9921379089356
RI-PSO0.131,67657,009.2135.8889374732971
RI-GA0.1103,736177,192.219.6413018703461
OCO0.2529,21929,821.6499.806445360184
RI-GWO0.2530,100252,264.762929.30351305008
RI-ACO0.2553,36257,415.5523.0371935367584
RI-PSO0.2532,89657,003.630.6185252666473
RI-GA0.25121,289193,924.824.688747882843
OCO0.529,36730,011.7333.434591770172
RI-GWO0.530,207261,022.393064.4821677208
RI-ACO0.551,70557,332.8823.3675134181976
RI-PSO0.531,39657,218.2934.7962794303894
RI-GA0.5110,632195,980.89.07850098609924
OCO0.7529,65830,261.6478.82949590683
RI-GWO0.7531,108258,356.642811.53076457977
RI-ACO0.7552,29657,220.3223.4361596107483
RI-PSO0.7530,64456,953.9633.5193908214569
RI-GA0.75132,413192,981.324.235230922699
Table 13. Wilcoxon signed-rank test ( α = 0.05 ).
Metric | Comparison | p-Value | Significant
Tour length | OCO vs. RI-GWO | 0.00015 | Yes
Tour length | OCO vs. RI-ACO | 0.00003 | Yes
Tour length | OCO vs. RI-PSO | 0.00006 | Yes
Tour length | OCO vs. RI-GA | 0.00003 | Yes
Offline performance | OCO vs. RI-GWO | 0.00003 | Yes
Offline performance | OCO vs. RI-ACO | 0.00003 | Yes
Offline performance | OCO vs. RI-PSO | 0.00003 | Yes
Offline performance | OCO vs. RI-GA | 0.00003 | Yes
Computation time | OCO vs. RI-GWO | 0.00003 | Yes
Computation time | OCO vs. RI-ACO | 0.00003 | Yes
Computation time | OCO vs. RI-PSO | 0.00003 | Yes
Computation time | OCO vs. RI-GA | 0.00003 | Yes
Table 14. Friedman test results.
Metric | χ² | p-Value | Significant
Tour length | 59 | 4.7 × 10⁻¹² | Yes
Offline performance | 61 | 1.79 × 10⁻¹² | Yes
Computation time | 58.75 | 5.3 × 10⁻¹² | Yes
Table 15. Nemenyi post hoc test ( α = 0.05 ).
Metric | Comparison | p-Value
Tour length | OCO vs. RI-ACO | 2.69 × 10⁻⁶
Tour length | OCO vs. RI-GA | 4.15 × 10⁻¹¹
Tour length | OCO vs. RI-GWO | 0.259
Tour length | OCO vs. RI-PSO | 0.056
Offline performance | OCO vs. RI-ACO | 7.99 × 10⁻⁷
Offline performance | OCO vs. RI-GA | 0.030
Offline performance | OCO vs. RI-GWO | 0.10
Offline performance | OCO vs. RI-PSO | 8.34 × 10⁻¹²
Computation time | OCO vs. RI-ACO | 0.076
Computation time | OCO vs. RI-GA | 0.021
Computation time | OCO vs. RI-GWO | 2.69 × 10⁻⁶
Computation time | OCO vs. RI-PSO | 0.380
Disclaimer/Publisher’s Note: The statements, opinions and data contained in all publications are solely those of the individual author(s) and contributor(s) and not of MDPI and/or the editor(s). MDPI and/or the editor(s) disclaim responsibility for any injury to people or property resulting from any ideas, methods, instructions or products referred to in the content.
