4.2.2. Grey Wolf Population Optimization Mechanism
The specific process of the multi-objective grey wolf optimization mechanism is shown in Algorithm 2. The algorithm adopts a dynamic selection strategy based on the Pareto dominance relationship: first, the population is divided into levels through fast non-dominated sorting; then, the leader wolves are selected from the non-dominated solution set of the highest level according to specific optimization objectives. Specifically, the α wolf is the solution with the fewest vehicles, reflecting the pursuit of minimal operational cost; the β wolf is the solution with the shortest path distance, dedicated to optimizing travel efficiency; and the δ wolf is the solution with the highest congestion degree (crowding distance), which maintains population diversity by measuring the neighborhood density in the objective space. During the search, the leader wolves are updated using an elite retention strategy: after each generation of optimization, the algorithm merges the current population with the external archive, re-executes the non-dominated sorting, and updates the three leader wolves. This mechanism ensures that high-quality solutions are inherited and, through the diversity maintenance function of the δ wolf, helps avoid premature convergence.
In terms of encoding design, an intuitive integer sequence encoding is adopted, in which each solution is represented as an ordered sequence of depot nodes (0) and customer point numbers. For example, the encoding [0, 3, 5, 0, 2, 4, 0] represents two vehicle routes: the first vehicle’s path is depot → customer point 3 → customer point 5 → depot; the second vehicle’s path is depot → customer point 2 → customer point 4 → depot. This encoding fully retains the topological structure and visiting order of the paths, which facilitates subsequent path optimization.
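The encoding described above can be decoded into per-vehicle routes by splitting the sequence at each depot marker. The following is a minimal sketch of that step; the helper name `decode_routes` is hypothetical, and the input sequence is the one that corresponds to the two routes in the example (3 → 5 and 2 → 4).

```python
from typing import List

DEPOT = 0  # depot node id used in the integer sequence encoding


def decode_routes(encoding: List[int]) -> List[List[int]]:
    """Split a flat integer sequence into one customer list per vehicle.

    Hypothetical helper for illustration only. Empty routes (two adjacent
    depot markers) are skipped.
    """
    routes: List[List[int]] = []
    current: List[int] = []
    for node in encoding:
        if node == DEPOT:
            if current:
                routes.append(current)
            current = []
        else:
            current.append(node)
    if current:  # tolerate a sequence that lacks the closing depot marker
        routes.append(current)
    return routes


routes = decode_routes([0, 3, 5, 0, 2, 4, 0])
vehicle_num = len(routes)  # one route per vehicle
```

Because the number of vehicles is simply the number of decoded routes, the first objective can be read off the encoding without any additional bookkeeping.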
The fitness function adopts a hybrid evaluation mechanism based on Pareto rank. For the selection of leader wolves, the three optimization objectives are evaluated independently: the fitness of the α wolf mainly considers the number of vehicles, the β wolf focuses on the path distance, and the δ wolf emphasizes the congestion degree. For ordinary individuals, a weighted sum method converts the multiple objectives into a single objective for comprehensive evaluation, with weight coefficients that are dynamically adjusted with the number of iterations. Under this dynamic adjustment, slight constraint violations are tolerated in the early stage of optimization to expand the search space, while in the later stage the constraints are tightened to guide the population toward the feasible region. Specifically, the capacity constraint penalty accumulates the overload amount: when a vehicle’s load exceeds its rated capacity, the penalty equals the weighted sum of the overloaded part. The time window penalty quantifies the deviation between the service time and the required time window of each customer point.
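The weighted-sum evaluation for ordinary individuals can be sketched as follows. The numeric values (0.6, 0.4, 0.1) and the linear growth of the penalty coefficient are taken from Algorithm 2; the constant and function names are assumptions made for illustration, and the time window deviation is passed in as a precomputed number because its exact formula is not given in the text.

```python
from typing import Dict, List

W_VEHICLES = 0.6    # weight of the vehicle count (value from Algorithm 2)
W_DISTANCE = 0.4    # weight of the total distance (value from Algorithm 2)
PENALTY_INIT = 0.1  # initial penalty coefficient (value from Algorithm 2)


def capacity_overload(routes: List[List[int]], demand: Dict[int, float],
                      capacity: float) -> float:
    """Accumulate the load that exceeds the rated capacity over all routes."""
    return sum(max(0.0, sum(demand[c] for c in route) - capacity)
               for route in routes)


def penalty_coefficient(iteration: int, max_iter: int) -> float:
    """Coefficient grows linearly, so violations cost more in later generations."""
    return PENALTY_INIT * (1 + iteration / max_iter)


def fitness(routes: List[List[int]], total_distance: float,
            demand: Dict[int, float], capacity: float,
            tw_violation: float, iteration: int, max_iter: int = 50) -> float:
    """Weighted sum of both objectives plus the dynamically weighted penalty."""
    penalty = capacity_overload(routes, demand, capacity) + tw_violation
    return (W_VEHICLES * len(routes)
            + W_DISTANCE * total_distance
            + penalty_coefficient(iteration, max_iter) * penalty)


demand = {1: 4.0, 2: 5.0, 3: 3.0}
early = fitness([[1, 2], [3]], 10.0, demand, 8.0, 0.0, iteration=0)
late = fitness([[1, 2], [3]], 10.0, demand, 8.0, 0.0, iteration=50)
```

In this example the first route is overloaded by one unit, so the same solution scores worse at the final generation than at the first one, which is the intended effect of the growing penalty coefficient.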
As shown in Figure 2, the algorithm realizes the collaborative update of population positions through the leader wolf guidance mechanism. In each generation of optimization, the α, β, and δ leader wolves represent the elite solutions in three different optimization directions. Ordinary wolves accept the guidance of the leader wolves with a probability of 70% and update their positions through sequential crossover operations. The position update strategy reflects the collaborative hunting behavior of the grey wolf pack: in the encirclement stage, the three leader wolves guide the search direction along the dimensions of vehicle number, travel cost, and population diversity, respectively; in the attack stage, new solutions are generated through crossover operations and only feasible solutions are retained. This mechanism preserves population diversity while driving continuous convergence toward high-quality solution regions. The iteration terminates after 50 generations, balancing solution efficiency and quality.
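The per-wolf update described above can be sketched as a single dispatch step. The 70% guidance probability comes from the text; everything else is an assumption made for illustration: the function name `update_wolf`, the random choice of one leader per update, and the fallback to the old position when the candidate is infeasible. The crossover, reconstruction, and feasibility check are passed in as callables so the sketch stays independent of their details.

```python
import random
from typing import Callable, List, Sequence

Solution = List[int]
GUIDE_PROB = 0.7  # probability of accepting leader guidance (from the text)


def update_wolf(
    wolf: Solution,
    leaders: Sequence[Solution],
    crossover: Callable[[Solution, Solution], Solution],
    rebuild: Callable[[Solution], Solution],
    is_feasible: Callable[[Solution], bool],
    rng: random.Random,
) -> Solution:
    """One position update for an ordinary wolf (illustrative sketch)."""
    if rng.random() < GUIDE_PROB:
        # Guided move: recombine with one leader.
        # Picking the leader uniformly at random is an assumption.
        candidate = crossover(rng.choice(leaders), wolf)
    else:
        # Unguided move: discard the current path and rebuild it.
        candidate = rebuild(wolf)
    # Only feasible solutions are retained; keeping the old position
    # when the candidate is infeasible is an assumption.
    return candidate if is_feasible(candidate) else wolf
```

Passing the random generator explicitly keeps runs reproducible, which is convenient when comparing parameter settings.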
| Algorithm 2: Multi-objective Grey Wolf Optimization Mechanism. |
| Require: graph: VrptwGraph |
| num_wolves = 30 |
| max_iter = 50 |
| archive_size = 20 |
| w1 = 0.6, w2 = 0.4 |
| λ_init = 0.1 |
| Ensure: Pareto_front: List[(vehicle_num, total_distance)] |
| Initialize the non-dominated solution set: |
| Wolf pack position: randomly generate feasible solutions |
| Fitness evaluation: calculate the two objective values for each solution |
| Non-dominated solution set → α wolf, β wolf, δ wolf |
| Pareto frontier archive = [] |
| Fitness assessment: |
| Calculate the multi-objective fitness for each solution: |
| fitness = w1 × vehicle_num + w2 × total_distance + λ × penalty |
| penalty = capacity violation amount + time window violation amount |
| λ = λ_init × (1 + iter/max_iter) # Dynamic penalty coefficient |
| while iter ≤ max_iter do |
| a. For each wolf: |
| i. Update its position according to the positions of the α, β, and δ wolves. |
| ii. When generating new solutions, apply the path repair strategy. |
| b. Calculate the Pareto frontier of all solutions: |
| Use non-dominated sorting to filter the solutions in the current population and the archive. |
| Calculate the congestion degree of each solution (to ensure the diversity of the solutions). |
| Update the archive to the top archive_size optimal solutions. |
| c. Update the α, β, and δ wolves: |
| α wolf = the solution with the smallest number of vehicles in the archive. |
| β wolf = the solution with the smallest total distance in the archive. |
| δ wolf = the solution with the highest congestion degree in the archive. |
| d. Dynamically adjust parameters: |
| The convergence factor a decreases linearly from 2 to 0. |
| Random vectors r1, r2 ∈ [0, 1] |
| end while |
| return archive # Output the Pareto optimal solution set |
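Steps b and c of Algorithm 2 (filtering the non-dominated solutions, measuring their congestion degree, and choosing the three leader wolves) can be sketched as below. This is an illustration under stated assumptions rather than the authors' implementation: the congestion degree is computed as the crowding distance known from NSGA-II, the names `alpha`, `beta`, and `delta` follow the usual grey wolf optimizer labels, and the third leader is chosen among the solutions not already selected as the first two, because boundary solutions always receive an infinite crowding distance.

```python
from typing import List, Tuple

Objective = Tuple[int, float]  # (vehicle_num, total_distance), both minimized


def dominates(a: Objective, b: Objective) -> bool:
    """a dominates b if it is no worse in every objective and better in one."""
    return (all(x <= y for x, y in zip(a, b))
            and any(x < y for x, y in zip(a, b)))


def non_dominated(points: List[Objective]) -> List[Objective]:
    """Keep the points that no other point dominates (input order preserved)."""
    return [p for p in points if not any(dominates(q, p) for q in points)]


def crowding_distance(front: List[Objective]) -> List[float]:
    """Crowding distance per point; boundary points get infinity."""
    n = len(front)
    if n == 0:
        return []
    dist = [0.0] * n
    for m in range(2):  # loop over the two objectives
        order = sorted(range(n), key=lambda i: front[i][m])
        lo, hi = front[order[0]][m], front[order[-1]][m]
        dist[order[0]] = dist[order[-1]] = float("inf")
        if hi == lo:
            continue
        for k in range(1, n - 1):
            gap = front[order[k + 1]][m] - front[order[k - 1]][m]
            dist[order[k]] += gap / (hi - lo)
    return dist


def select_leaders(archive: List[Objective]):
    """Pick the three leaders from a non-empty archive."""
    alpha = min(archive, key=lambda p: (p[0], p[1]))  # fewest vehicles
    beta = min(archive, key=lambda p: (p[1], p[0]))   # shortest distance
    dist = crowding_distance(archive)
    # Assumption: look for the sparsest solution among the remaining ones.
    rest = [i for i, p in enumerate(archive) if p not in (alpha, beta)]
    delta = archive[max(rest, key=lambda i: dist[i])] if rest else alpha
    return alpha, beta, delta


points = [(3, 100.0), (4, 90.0), (5, 70.0), (6, 65.0), (5, 95.0)]
front = non_dominated(points)
alpha, beta, delta = select_leaders(front)
```

Here (5, 95.0) is dominated by (4, 90.0) and is filtered out, and the third leader is the interior solution with the widest gap to its neighbors on the front.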
As shown in Figure 3, solutions are crossed sequentially in the form of path recombination to simulate the cooperative search behavior of wolf packs. The crossover first randomly selects a path split point, which is usually located at a customer node rather than the depot node, to preserve the integrity and feasibility of the path segments. The split point must meet two conditions: it must not disrupt the continuity of a single vehicle’s path, and each sub-path after segmentation must still satisfy the vehicle capacity constraint. The segment inheritance operation achieves knowledge transfer by integrating high-quality segments from different paths: the ordinary wolf C follows leader wolf A, [0, 1, 2, 0, 3, 4, 0], which seeks the minimum number of vehicles, and leader wolf B, [0, 5, 6, 0, 7, 8, 0], which seeks the lowest travel cost, and inherits path segments from both. It combines the front segment [0, 1, 2] of leader wolf A with the back segment [0, 7, 8, 0] of leader wolf B to form a new hunting path [0, 1, 2, 0, 7, 8, 0]. Because direct combination may lead to repeated or missing visits to customer points, conflict detection and resolution follow: repeated customer numbers are identified through a node mapping table, and conflicts are eliminated using the nearest neighbor replacement strategy, ultimately generating the repaired path [0, 1, 2, 5, 0, 7, 8, 6, 0]. This crossover strategy maintains the validity of the solution structure and improves search efficiency through the inheritance of elite path segments.
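The recombination and repair steps can be illustrated with a small sketch. Several simplifications are assumptions and not taken from the paper: both parents cover the same customer set (unlike the schematic in Figure 3), the split points fall on route boundaries, repeated customers keep their first visit, and each missing customer is inserted directly after its geometrically nearest routed customer as one possible reading of the nearest neighbor strategy. Capacity and time window checks are omitted, and all function names are hypothetical.

```python
import math
from typing import Dict, List, Tuple

DEPOT = 0
Coord = Tuple[float, float]


def split_routes(seq: List[int]) -> List[List[int]]:
    """Split a depot-delimited sequence into per-vehicle customer lists."""
    routes, current = [], []
    for node in seq:
        if node == DEPOT:
            if current:
                routes.append(current)
            current = []
        else:
            current.append(node)
    if current:
        routes.append(current)
    return routes


def sequential_crossover(leader_a: List[int], leader_b: List[int],
                         cut_a: int, cut_b: int,
                         coords: Dict[int, Coord]) -> List[int]:
    """Join the first cut_a routes of A with the routes of B from cut_b on."""
    routes_a, routes_b = split_routes(leader_a), split_routes(leader_b)
    customers = {c for route in routes_a for c in route}

    # Segment inheritance with conflict detection: skip repeated customers.
    child: List[List[int]] = []
    seen = set()
    for route in routes_a[:cut_a] + routes_b[cut_b:]:
        kept = []
        for c in route:
            if c not in seen:
                seen.add(c)
                kept.append(c)
        if kept:
            child.append(kept)

    # Conflict resolution: re-insert each missing customer after the
    # nearest customer that is already routed.
    for c in sorted(customers - seen):
        if not child:
            child.append([c])
            continue
        ri, pos = min(
            ((i, j) for i, r in enumerate(child) for j in range(len(r))),
            key=lambda ij: math.dist(coords[c], coords[child[ij[0]][ij[1]]]),
        )
        child[ri].insert(pos + 1, c)

    flat = [DEPOT]
    for route in child:
        flat.extend(route + [DEPOT])
    return flat


coords = {1: (0, 0), 2: (1, 0), 3: (5, 5), 4: (6, 5), 5: (1, 1), 6: (3, 0)}
child = sequential_crossover([0, 1, 2, 0, 3, 4, 0, 5, 6, 0],
                             [0, 5, 1, 0, 2, 6, 0, 3, 4, 0],
                             cut_a=1, cut_b=1, coords=coords)
```

In this example the raw combination visits customer 2 twice and omits customer 5; after repair every customer appears exactly once, with customer 5 placed next to its nearest neighbor, customer 2.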
As shown in Figure 4, the mutation operation takes the form of a solution reset: a complete reconstruction strategy simulates the exploration behavior of the wolf pack and thereby maintains population diversity. When a wolf does not accept the guidance of the leader wolf (a trigger probability of 30%), the system initiates an intelligent reconstruction mechanism to generate a new path. The reconstruction proceeds in three stages. First, the original hunting path [0, 1, 2, 0, 3, 4, 0] is broken up and its customer points are randomly rearranged, giving [1, 2, 3, 4]. Next, the customer points are sorted in descending order of urgency, giving [4, 2, 1, 3], so that the more urgent points are reached first. Finally, the sequence is segmented using the vehicle capacity as a dynamic threshold: when the cumulative demand exceeds the capacity constraint, the depot node (number 0) is immediately inserted to close the current hunting path and start a new one, giving [0, 4, 2, 0, 1, 3, 0]. After the reconstruction is completed, the system verifies that the new path meets all constraints, including the time window, capacity, and vehicle number constraints. This mutation strategy based on complete reconstruction overcomes the small-perturbation limitation of traditional single-point or exchange mutations. Through large-scale solution reconstruction, it helps the algorithm escape local optima and raises the probability of reaching the global optimal solution.
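The three reconstruction stages can be sketched as follows. The function name `rebuild_solution` and the `urgency` and `demand` dictionaries are assumptions made for illustration (the text does not define how urgency is computed), and the final constraint verification is omitted. The sample values are chosen so that the result reproduces the example in the text.

```python
import random
from typing import Dict, List, Optional

DEPOT = 0


def rebuild_solution(encoding: List[int], urgency: Dict[int, float],
                     demand: Dict[int, float], capacity: float,
                     rng: Optional[random.Random] = None) -> List[int]:
    """Rebuild a path from scratch (illustrative sketch)."""
    rng = rng or random.Random()

    # Stage 1: drop the depot markers and shuffle the customers,
    # which breaks the original visiting order.
    customers = [n for n in encoding if n != DEPOT]
    rng.shuffle(customers)

    # Stage 2: sort by descending urgency. The sort is stable, so the
    # shuffle only decides the order among equally urgent customers.
    customers.sort(key=lambda c: urgency[c], reverse=True)

    # Stage 3: cut the sequence whenever the next customer would
    # exceed the vehicle capacity.
    new_encoding, load = [DEPOT], 0.0
    for c in customers:
        if load + demand[c] > capacity and new_encoding[-1] != DEPOT:
            new_encoding.append(DEPOT)  # close the current route
            load = 0.0
        new_encoding.append(c)
        load += demand[c]
    if new_encoding[-1] != DEPOT:
        new_encoding.append(DEPOT)
    return new_encoding


urgency = {1: 2.0, 2: 3.0, 3: 1.0, 4: 4.0}  # hypothetical urgency scores
demand = {1: 5.0, 2: 5.0, 3: 5.0, 4: 5.0}   # hypothetical demands
rebuilt = rebuild_solution([0, 1, 2, 0, 3, 4, 0], urgency, demand,
                           capacity=10.0, rng=random.Random(1))
```

With distinct urgency scores the outcome does not depend on the shuffle, so `rebuilt` is [0, 4, 2, 0, 1, 3, 0] for any seed; the shuffle contributes diversity only when several customers are equally urgent.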
The algorithm adopts an elite retention strategy to ensure that high-quality solutions are not lost. In each generation of optimization, the fitness of all individuals is evaluated, and whenever a solution better than the current best is found, the global best record is updated immediately. The selection mechanism plays a dual guiding role: on the one hand, guidance by the leader wolves strengthens local exploitation and drives quick convergence to the current best region; on the other hand, intelligent mutation preserves global exploration by continuously probing new regions of the solution space.