Article

Research on a Multi-Agent Job Shop Scheduling Method Based on Improved Game Evolution

1 School of Information Science and Engineering, Harbin Institute of Technology at WeiHai, Weihai 264209, China
2 School of Ocean Engineering, Harbin Institute of Technology at WeiHai, Weihai 264209, China
* Author to whom correspondence should be addressed.
Symmetry 2025, 17(8), 1368; https://doi.org/10.3390/sym17081368
Submission received: 16 July 2025 / Revised: 9 August 2025 / Accepted: 14 August 2025 / Published: 21 August 2025
(This article belongs to the Section Computer)

Abstract

As the global manufacturing industry’s transformation accelerates toward being intelligent, “unmanned”, and low-carbon, manufacturing workshops face conflicts between production schedules and transportation tasks, leading to low efficiency and resource waste. This paper presents a multi-agent collaborative scheduling optimization method based on a hybrid game–genetic framework to address issues like high AGV (Automated Guided Vehicle) idle rates, excessive energy consumption, and uncoordinated equipment scheduling. The method establishes a trinity system integrating distributed decision-making, dynamic coordination, and environment awareness. In this system, the multi-agent decision-making and collaboration process exhibits significant symmetry characteristics. All agents (machine agents, mobile agents, etc.) follow unified optimization criteria and interaction rules, forming a dynamically balanced symmetric scheduling framework in resource competition and collaboration, which ensures fairness and consistency among different agents in task allocation, path planning, and other links. An improved best-response dynamic algorithm is employed in the decision-making layer to solve the multi-agent Nash equilibrium, while the genetic optimization layer enhances the global search capability by encoding scheduling schemes and adjusting crossover/mutation probabilities using dynamic competition factors. The coordination pivot layer updates constraints in real time based on environmental sensing, forming a closed-loop optimization mechanism. Experimental results show that, compared with the traditional genetic algorithm (TGA) and particle swarm optimization (PSO), the proposed method reduces the maximum completion time by 54.5% and 44.4% in simple scenarios and 57.1% in complex scenarios, the AGV idling rate by 68.3% in simple scenarios and 67.5%/77.6% in complex scenarios, and total energy consumption by 15.7%/10.9% in simple scenarios and 25%/18.2% in complex scenarios. 
This validates the method’s effectiveness in improving resource utilization and energy efficiency, providing a new technical path for intelligent scheduling in manufacturing workshops. Meanwhile, its symmetric multi-agent collaborative framework also offers a reference for the application of symmetry in complex manufacturing system optimization.

1. Introduction

With the accelerating transformation of global industry toward being intelligent and low-carbon, manufacturing workshops are faced with bottleneck problems such as sluggish responses to orders of multiple varieties, high empty driving rates of AGVs, and redundant equipment energy consumption [1,2,3]. Traditional scheduling methods usually separate production planning and logistics planning, resulting in frequent spatiotemporal conflicts between processing and material transportation and a lack of collaborative consideration of the dynamic allocation of storage resources and AGV charging scheduling [4,5,6]. Especially in complex scenarios involving multi-process nesting (e.g., spinning–weaving–dyeing), realizing the multi-dimensional resource collaboration of machines, AGVs, warehouses, and charging piles has become a key challenge in improving the energy efficiency and flexibility of manufacturing workshops [7,8].
The existing manufacturing workshop scheduling research mostly focuses on the optimization of processing equipment allocation while ignoring the dynamic coordination between material transportation and production planning [9]. However, the manufacturing process has the characteristics of a complex process, strong heterogeneity of equipment, and frequent interleaving of AGV transportation routes [10,11]. If the traditional flexible job shop scheduling model is used, it will face three bottlenecks:
  • The independent optimization of the machine allocation and AGV path will lead to an increase in AGV waiting time and an idling rate of more than 30% [12];
  • The equipment starts and stops frequently, and the idle energy consumption accounts for more than 25% of the total energy consumption [13];
  • When dynamic orders are inserted, the fixed scheduling rules make it difficult to quickly reconstruct the production schedule, and the order delivery delay rate increases by 40% [14].
Realizing the coordinated scheduling of machines, AGVs, warehouses, and charging piles, together with the balanced optimization of multiple objectives (maximum completion time, energy consumption, AGV utilization), has therefore become a core problem in intelligent manufacturing [15].

2. Related Work, Problems, and Challenges

2.1. Existing Scheduling Methods for Manufacturing Workshops

The existing methods for the machine–AGV–warehouse–charging pile collaborative scheduling problem can be categorized into three main approaches. The first category consists of exact algorithms based on mathematical programming. For instance, Homayouni and Fontes (2019) [16] developed a mixed-integer linear programming (MILP) model for integrated production–transportation scheduling in flexible job shops with automated material handling systems. Their model optimizes both machine operations and AGV routing while considering setup times and transportation constraints, achieving provably optimal solutions for small-scale problems. However, as problem complexity increases (e.g., large-scale workshops with dynamic task changes), the computational burden escalates exponentially, rendering it impractical for real-time applications [9,16].
The second category involves intelligent optimization algorithms. Song et al. (2020) proposed an improved NSGA-II tailored for multi-objective hybrid flow shop scheduling, emphasizing minimized makespan, energy consumption, and AGV waiting time. The algorithm introduces a dynamic weight adjustment mechanism for Pareto front selection, enhancing convergence speed by 20–30% compared to the traditional NSGA-II [17]. Zhang et al. (2018) advanced a nested particle swarm optimization (NPSO) algorithm where “main particles” explore global machine schedules while “nested particles” refine AGV path planning. This hierarchical structure reduces computation time by 15–20% in large-scale scenarios but still suffers from parameter rigidity (e.g., fixed inertia weights), limiting its adaptability to sudden task fluctuations [18,19].
The third category encompasses distributed methods integrating multi-agent systems (MAS) and game theory. Xiao et al. (2022) designed a digital twin-driven framework for AGV scheduling, where real-time data from physical AGVs update a virtual model to predict bottlenecks and adjust routes dynamically. This system achieves 95% task completion accuracy under uncertain conditions but requires high levels of computational resources for model synchronization [20]. Luo et al. (2021) combined genetic algorithms (GAs) with MAS to initialize agent strategies, enabling self-organizing task allocation among machines and AGVs. However, GA-based initialization often converges to suboptimal solutions in dynamic environments [21]. Li et al. (2020) proposed a multi-agent game-theoretic approach for textile workshop logistics where robots negotiate task assignments via a Stackelberg game model. While this method reduces conflicts by 40%, it struggles with scalability due to the increased communication overhead as agent numbers grow [22].

2.2. Core Challenges in Current Scheduling

However, such methods still have obvious limitations:
  • A single game is prone to conflict in resource allocation due to insufficient negotiation [22];
  • There is a lack of refined modeling of AGV charging scheduling, machine energy consumption differences, and other practical constraints [23];
  • The population diversity declines during evolution, and it is difficult to approach the Pareto front stably [24].
Notably, many of these challenges stem from the lack of symmetry in decision-making frameworks: existing methods often treat machine scheduling, AGV transportation, and energy management as isolated subsystems with disjoint optimization criteria, leading to unbalanced resource competition and inefficient collaboration. In contrast, a symmetric approach—where all agents adhere to unified interaction rules and optimization benchmarks—can inherently reduce conflicts by ensuring fair resource allocation and consistent response mechanisms across heterogeneous components. This symmetry in agent behaviors and decision logics provides a foundational principle to harmonize local objectives (e.g., machine utilization, AGV efficiency) with global goals (e.g., minimal makespan, low energy consumption), laying a theoretical basis for the collaborative scheduling framework proposed herein.
Given the above challenges, this paper proposes a multi-agent cooperative scheduling method based on a game–genetic hybrid framework. By constructing a trinity system architecture of “distributed decision-making–dynamic coordination–environmental awareness”, the method jointly models the core links of the manufacturing workshop, such as equipment scheduling, logistics transportation, and energy management. Specifically, at the game decision-making level, each agent (machine agent, mobile agent, etc.) selects a strategy based on its own goals (such as equipment utilization and transportation efficiency), and the Nash equilibrium is solved with the improved best-response dynamic algorithm to dynamically optimize machine selection, path planning, and the charging schedule. The genetic optimization layer encodes the game results into gene sequences and combines the elite retention strategy with adaptive genetic operators (dynamically adjusting the crossover/mutation probability according to the intensity of competition) to enhance the global search ability. The coordination pivot layer monitors the equipment status, transportation progress, and other data in real time, and dynamically updates the game strategy and genetic constraints to ensure the feasibility of the scheme. Experiments show that this method outperforms the traditional algorithms in terms of task completion time, AGV idling rate, and total energy consumption, and it provides a new technical path for intelligent scheduling in the manufacturing industry.

3. Materials and Methods

3.1. Problem Description

The scheduling problem of a manufacturing workshop based on multiple agents can be formulated as follows: the machine agent set $M = \{M_1, M_2, \ldots, M_m\}$, mobile robot agent set $A = \{A_1, A_2, \ldots, A_a\}$, warehouse agent set $W = \{W_1, W_2, \ldots, W_w\}$, and charging pile agent set $C = \{C_1, C_2, \ldots, C_c\}$. A production order contains $n$ jobs $J = \{J_1, J_2, \ldots, J_n\}$, where each job $J_i$ consists of $h_i$ operations $O_{i1}, O_{i2}, \ldots, O_{ih_i}$. Each operation $O_{ij}$ can be processed on a specified machine subset $M_{ij} \subseteq M$ with a processing time $t_{ijk}$ when executed by machine $M_k \in M_{ij}$ (here, $O_{ij}$ denotes the $j$-th operation of the $i$-th job).
Upon the arrival of a warehouse order, the task agent decomposes job $J_i$ into operation-level tasks, generating an operation sequence $O_{i1} \to O_{i2} \to \cdots \to O_{ih_i}$, and records the feasible machine set $M_{ij}$ and the corresponding processing time $t_{ijk}$ for each operation. A machine agent $M_k$ competes for operation $O_{ij}$ based on its current load $Q_k$ (i.e., the number of pending operations in its queue) and the Euclidean distance $D_{ijk}$ between the job’s current position and $M_k$.
After operation assignment, the mobile robot agent $A_l$ retrieves the job from warehouse $W_k$ and transports it to the target machine. Once all operations $O_{ij}$ of job $J_i$ are completed, the mobile robot transports the finished job back to warehouse $W_k$, and the warehouse agent updates the inventory status. Figure 1 shows the multi-agent shop scheduling process.

3.2. Mathematical Model

3.2.1. Objective Function

1. Minimization of Total Task Completion Time
The completion time of job $J_i$ is defined as the maximum completion time across all of its operations, and the total task completion time of the workshop is the maximum value among the completion times of all jobs:
$$f_1 = \max_{1 \le i \le n} \; \max_{1 \le j \le h_i} C_{ijk} \tag{1}$$
where $C_{ijk}$ denotes the completion time of operation $O_{ij}$ processed on machine $M_k$, and $h_i$ represents the number of operations for job $J_i$.
2. Minimization of Empty Running Rate
The empty running rate is defined as the ratio of the empty running distance of mobile agents to their total running distance (including both empty and loaded segments):
$$f_2 = \frac{\sum_{i=1}^{a} \sum_{t} d_i^{idle}(t)}{\sum_{i=1}^{a} \sum_{t} \left( d_i^{idle}(t) + d_i^{load}(t) \right)} \tag{2}$$
where $a$ is the number of mobile agents, and $d_i^{idle}(t)$ and $d_i^{load}(t)$ denote the empty and loaded running distances of mobile agent $A_i$ at time $t$, respectively.
3. Minimization of Total Energy Consumption
Total energy consumption comprises two components, energy consumed during machine processing and energy consumed by AGV transportation:
$$f_3 = \sum_{k=1}^{m} \sum_{i=1}^{n} \sum_{j=1}^{h_i} E_{ijk}^{m} \cdot t_{ijk} \cdot x_{ijk} + \sum_{i=1}^{a} \left( \sum_{t} d_i^{idle}(t) \cdot E_i^{idle} + \sum_{t} d_i^{load}(t) \cdot E_i^{load} \right) \tag{3}$$
where $m$ is the number of machines; $E_{ijk}^{m}$ represents the energy consumption per unit time of machine $M_k$ processing operation $O_{ij}$; $t_{ijk}$ is the processing time of operation $O_{ij}$ on machine $M_k$; $x_{ijk}$ is a decision variable (taking 1 if $O_{ij}$ is processed on $M_k$); and $E_i^{idle}$ and $E_i^{load}$ denote the energy consumption per unit distance of mobile agent $A_i$ under empty and loaded conditions, respectively.
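As an illustration, the three objectives can be evaluated on a toy schedule as follows. This is a minimal Python sketch rather than the implementation used in the experiments: the record types `ScheduledOp` and `AgvLog`, their field names, and all numerical values are assumptions introduced here. Only assigned operations are listed, so the decision variable x_ijk is implicit.

```python
from dataclasses import dataclass


@dataclass
class ScheduledOp:
    """One operation O_ij placed on machine M_k (hypothetical record layout)."""
    job: int
    op: int
    machine: int
    start: float
    duration: float  # processing time t_ijk
    power: float     # E^m_ijk, machine energy per unit time


@dataclass
class AgvLog:
    """Aggregated travel of one mobile agent (hypothetical record layout)."""
    idle_dist: float  # empty distance summed over time
    load_dist: float  # loaded distance summed over time
    e_idle: float     # energy per unit distance when empty
    e_load: float     # energy per unit distance when loaded


def makespan(ops):
    """f1: latest completion time over all scheduled operations."""
    return max(o.start + o.duration for o in ops)


def empty_running_rate(agvs):
    """f2: empty distance divided by total distance over all AGVs."""
    idle = sum(a.idle_dist for a in agvs)
    total = sum(a.idle_dist + a.load_dist for a in agvs)
    return idle / total if total > 0 else 0.0


def total_energy(ops, agvs):
    """f3: machine processing energy plus AGV travel energy."""
    machine = sum(o.power * o.duration for o in ops)
    travel = sum(a.idle_dist * a.e_idle + a.load_dist * a.e_load for a in agvs)
    return machine + travel


# Toy data (illustrative values only).
ops = [
    ScheduledOp(job=1, op=1, machine=1, start=0.0, duration=4.0, power=2.0),
    ScheduledOp(job=1, op=2, machine=2, start=5.0, duration=3.0, power=1.5),
    ScheduledOp(job=2, op=1, machine=2, start=0.0, duration=5.0, power=1.5),
]
agvs = [AgvLog(idle_dist=10.0, load_dist=30.0, e_idle=0.1, e_load=0.2)]

f1 = makespan(ops)
f2 = empty_running_rate(agvs)
f3 = total_energy(ops, agvs)
print(f1, f2, f3)
```

For this toy data, the sketch gives a makespan of 8, an empty running rate of 0.25, and a total energy of about 27.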

3.2.2. Constraint Condition

  • Precedence Constraint of Operations
    Operations of the same job must be processed in sequence such that a subsequent operation can only start after the completion of its preceding operation:
    $$C_{ij} \le S_{i(j+1)}, \quad \forall i, j \tag{4}$$
    where $C_{ij}$ denotes the completion time of operation $O_{ij}$, and $S_{i(j+1)}$ denotes the start time of operation $O_{i(j+1)}$.
  • Machine Exclusivity Constraint
    At any given time, a single machine can process at most one operation:
    $$\sum_{i=1}^{n} \sum_{j=1}^{h_i} x_{ijk}(t) \le 1, \quad \forall k, t \tag{5}$$
    where $x_{ijk}(t) = 1$ indicates that machine $M_k$ is processing operation $O_{ij}$ at time $t$ (0 otherwise).
  • Transportation Constraints
    (a) Single-Task Exclusivity
    A mobile robot can transport at most one job at a time:
    $$\sum_{i=1}^{n} \sum_{j=1}^{h_i} y_{ijl}(t) \le 1, \quad \forall l, t \tag{6}$$
    where $y_{ijl}(t) = 1$ indicates that mobile robot $A_l$ is transporting operation $O_{ij}$ at time $t$ (0 otherwise).
    (b) Conflict-Free Path
    At any time, a single path node can be occupied by at most one mobile robot:
    $$\sum_{l=1}^{a} y_{ijl}(t, \ell) \le 1, \quad \forall \ell, t \tag{7}$$
    where $\ell$ represents a path node (e.g., workstation, warehouse, charging station), written as $\ell$ to distinguish it from the robot index $l$.
  • Charging Trigger Constraint
    When the battery level of a mobile robot drops below a threshold, transportation must be interrupted for charging:
    $$E_l(t) < \theta_E \cdot E_l^{max} \;\Rightarrow\; \exists\, t' \ge t: \; \sum_{\tau = t'}^{t' + t_{cm} - 1} y_{ijl}(\tau) = 0, \quad \forall l \tag{8}$$
    where $E_l(t)$ is the battery level of mobile robot $A_l$ at time $t$; $\theta_E = 20\%$ is the battery threshold; $E_l^{max}$ is the rated battery capacity; and $t_{cm}$ denotes the charging duration.
  • Warehouse Exclusivity Constraint
    At any time, a single warehouse can be accessed by at most one mobile robot:
    $$\sum_{l=1}^{a} z_{lw}(t) \le 1, \quad \forall w, t \tag{9}$$
    where $z_{lw}(t) = 1$ indicates that mobile robot $A_l$ is accessing warehouse $W_w$ at time $t$ (0 otherwise).
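The precedence, machine exclusivity, and charging trigger constraints above can be checked on a candidate schedule with a few lines of code. The following Python sketch is illustrative only: the tuple layout `(job, op_index, machine, start, end)` and the function names are assumptions made here, and the transportation and warehouse constraints are omitted for brevity.

```python
def check_precedence(ops):
    """Return (job, op, next_op) triples where an operation starts too early."""
    by_job = {}
    for job, idx, _machine, start, end in ops:
        by_job.setdefault(job, []).append((idx, start, end))
    violations = []
    for job, items in by_job.items():
        items.sort()  # order operations by their index within the job
        for (i1, _s1, e1), (i2, s2, _e2) in zip(items, items[1:]):
            if e1 > s2:  # completion of O_ij must not exceed start of O_i(j+1)
                violations.append((job, i1, i2))
    return violations


def check_machine_exclusivity(ops):
    """Return overlapping neighbours on each machine; empty means feasible."""
    by_machine = {}
    for job, idx, machine, start, end in ops:
        by_machine.setdefault(machine, []).append((start, end, job, idx))
    violations = []
    for machine, items in by_machine.items():
        items.sort()  # order by start time; any overlap shows up between neighbours
        for (_s1, e1, j1, o1), (s2, _e2, j2, o2) in zip(items, items[1:]):
            if s2 < e1:
                violations.append((machine, (j1, o1), (j2, o2)))
    return violations


def must_charge(battery, capacity, theta=0.2):
    """Charging trigger: battery level below theta times the rated capacity."""
    return battery < theta * capacity


# Toy schedules (illustrative values only).
ops = [(1, 1, 1, 0, 4), (1, 2, 2, 5, 8), (2, 1, 2, 0, 5)]
bad = ops + [(2, 2, 1, 3, 6)]  # starts before job 2's first op ends, on a busy machine

print(check_precedence(ops), check_machine_exclusivity(ops))
print(check_precedence(bad), check_machine_exclusivity(bad))
```

The first schedule passes both checks, while the second violates the precedence constraint for job 2 and the exclusivity constraint on machine 1.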

3.3. Multi-Agent System Architecture Design

The multi-agent collaborative workshop scheduling system is based on the trinity design concept of “distributed decision-making–dynamic coordination–environment awareness” [18], which aims to solve the comprehensive scheduling problems relating to resource heterogeneity, task dynamics, and environmental uncertainty in complex industrial scenarios. Its core goal is to achieve efficient task allocation, optimal utilization of resources, and global objectives through the autonomous decision-making and collaborative interaction of agents [22]. Figure 2 shows the framework of the multi-agent system.
This design idea has the advantage of being both distributed and centralized, and the decision management layer is composed of a judgment agent and a task agent. As the core decision-making unit, the judgment agent undertakes the task of receiving workshop demand or order information and decomposes it into a series of specific and operable subtasks with strategic thinking [4]. These subtasks cover the precise selection of processing equipment, the precise setting of processing time, and the reasonable planning of the material transportation path from the warehouse to the processing equipment. Subsequently, the task agent further deepens these decisions, elaborately formulates the detailed material planning, task allocation, warehouse management strategy, and energy allocation scheme [9], and provides clear guidance for subsequent implementation links.
The resource coordination layer includes mobile agents, machine agents, warehouse agents, and charge agents. The mobile agent is mainly responsible for the intelligent scheduling of mobile resources. According to the path environment data obtained in real time from the environment awareness layer, the improved A-star algorithm is used to plan the optimal transportation route, to ensure that the materials can be transported from the warehouse to the processing equipment with the fastest speed and the highest efficiency [25,26]. The machine agent focuses on the real-time monitoring of the operation status of processing equipment, comprehensively collecting various operation data from the equipment and feeding them back to the system in a timely manner [15] to realize the preventive maintenance management of equipment and the dynamic optimization and adjustment of task allocation. The warehouse agent is committed to the refined management of storage resources. Based on the in-depth analysis of material distribution data, it scientifically optimizes the storage layout, efficiently manages the incoming and outgoing processes of materials, and effectively ensures the stability and continuity of the material supply. The charge agent focuses on the charging management of mobile agents. By monitoring the power status of each mobile agent in real time and intelligently planning the charging time and charging path, it can effectively improve energy efficiency while ensuring the continuous and stable operation of the mobile agent [27].
The environment awareness layer is composed of an environment agent, which continuously collects multi-dimensional real-time data such as the path environment information, equipment status, material distribution, and power status [7]. These data not only provide solid data support for the decision-making of the resource coordination layer but also feed back to the decision-making management layer in time to help them quickly adjust the task planning and resource allocation strategy according to the dynamic changes of the actual environment, greatly enhancing the environmental adaptability and flexibility of the system [24].
A key feature of this architecture is the symmetry embedded in the agent interactions: machine agents, mobile agents, warehouse agents, and charge agents, despite their distinct functionalities, operate under a symmetric set of interaction rules and optimization criteria. For instance, machine agents compete for operations based on a unified utility function integrating load and distance, while mobile agents plan paths using a symmetric cost model balancing energy consumption and conflict avoidance. This symmetry ensures that no single agent type is prioritized arbitrarily, fostering fair competition for resources (e.g., machine queues, path nodes) and enabling dynamic equilibrium in task allocation. Moreover, the environment awareness layer provides symmetric information feedback to all agents—real-time data on equipment status, path conflicts, and energy levels are uniformly accessible, preventing information asymmetry that could disrupt collaborative efficiency. This symmetric design not only simplifies system complexity but also enhances robustness.

3.4. An Algorithmic Framework Based on a Hybrid Game–Genetic Algorithm

By combining the dynamic decision-making ability of game theory with the global search advantage of genetic algorithms, a trinity collaborative scheduling system of “distributed game decision-making–adaptive genetic optimization–real-time collaborative feedback” is constructed, which successfully solves the problems relating to the low utilization rate of equipment in manufacturing workshops, the high idling rate of AGVs, and difficult multi-objective conflict optimization. The main idea is to use the three-tier linkage mechanism to achieve a balance between local interests and global objectives, and to transform the production scheduling problem into a collaborative optimization process involving global genetic evolution and a multi-agent dynamic game.

3.4.1. The Decision-Makers in the Game

The game decision-making layer treats each machine agent and each mobile agent as an independent game participant. Machine agents compete for the processing tasks of workpieces, while mobile agents compete for their transportation tasks; the workpiece acts as a passive participant that triggers the game. Table 1 shows the strategy space of each participant.
The payoff function for the machine agent is as follows:
$$U_{MA_j} = \frac{\omega_1}{D_{ij}} - \omega_2 Q_j + \omega_3 \cdot I(T_{idle} > T_{th}) \tag{10}$$
To increase the utilization rate of manufacturing equipment, an idle-time term is triggered for equipment that has been idle for a long time ($T_{idle} > T_{th}$), which encourages idle equipment to participate actively in the game. Equipment that is close to the workpiece has a strong competitive advantage, because a shorter distance reduces the AGV transportation time and thereby further improves equipment utilization. At the same time, a queue penalty term is introduced to avoid long equipment queues.
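A small numerical example may help to show how the three terms of the machine payoff interact. The Python sketch below assumes that the distance term enters as ω1/D, so that closer machines score higher; the weight values, the idle threshold, and the function names are hypothetical choices made for illustration and are not taken from the experiments.

```python
def machine_utility(distance, queue_len, idle_time,
                    w1=1.0, w2=0.5, w3=2.0, t_th=10.0, eps=1e-6):
    """Payoff of one machine bidding for one operation (assumed form w1/D - w2*Q + w3*I)."""
    proximity = w1 / max(distance, eps)          # closer machines score higher
    queue_penalty = w2 * queue_len               # long queues are penalised
    idle_bonus = w3 if idle_time > t_th else 0.0  # indicator I(T_idle > T_th)
    return proximity - queue_penalty + idle_bonus


def pick_machine(candidates, **weights):
    """candidates maps machine id -> (distance, queue_len, idle_time)."""
    return max(candidates, key=lambda k: machine_utility(*candidates[k], **weights))


# Toy bids (illustrative values only).
candidates = {
    "M1": (2.0, 3, 0.0),   # near, but busy
    "M2": (4.0, 0, 15.0),  # farther, empty queue, idle beyond the threshold
    "M3": (1.0, 6, 0.0),   # nearest, but heavily loaded
}
winner = pick_machine(candidates)
print(winner)
```

With these values the idle bonus and the empty queue outweigh the longer distance, so the long-idle machine `M2` wins the operation.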
The payoff function for the mobile agent is as follows:
$$U_{AGV_k} = \frac{L_{loaded}}{L_{total}} - \mu \cdot \frac{E_{consumed}}{E_{max}} - \nu \cdot \sum_{t} \delta(t) \tag{11}$$
To reduce the idle rate of AGVs, the payoff rewards a high loaded-distance ratio, so AGVs give priority to transportation tasks with a high load rate. An energy consumption term penalizes paths with high energy consumption and thus reduces the energy cost. In addition, a conflict penalty term is introduced: when multiple AGVs occupy the same path node at the same time, this negative reward forces the AGVs to actively avoid such conflicts during path planning.
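The trade-off in the mobile agent payoff can be illustrated by comparing two candidate paths. In the Python sketch below, the conflict term is represented by a simple count of node conflicts along the path, and the weights μ and ν, the path data, and the function name are assumptions made for illustration.

```python
def path_utility(loaded_len, total_len, energy, e_max, conflicts, mu=0.5, nu=1.0):
    """Load ratio minus an energy penalty minus a conflict penalty."""
    load_ratio = loaded_len / total_len if total_len > 0 else 0.0
    return load_ratio - mu * energy / e_max - nu * conflicts


# Toy candidates: (loaded length, total length, energy used, battery capacity, conflicts).
paths = {
    "direct": (8.0, 10.0, 20.0, 100.0, 1),  # shorter, but crosses one occupied node
    "detour": (8.0, 12.0, 24.0, 100.0, 0),  # longer, conflict-free
}
best = max(paths, key=lambda name: path_utility(*paths[name]))
print(best)
```

With a conflict weight of 1, the single conflict on the direct path costs more than the extra distance and energy of the detour, so the detour is selected.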
The enhanced best-response dynamic algorithm is used to coordinate the agents in the game. Algorithm 1 shows the enhanced best-response dynamic algorithm. It consists of three stages. The first is the machine competition phase, in which each machine determines its utility based on the distance and queue state and bids for workpieces through a quotation mechanism; each workpiece selects the machine with the highest quotation, which triggers the AGVs to begin the corresponding transportation task. The second is the AGV competition phase, which generates candidate paths using the spatiotemporally constrained A-star algorithm, selects the AGV that minimizes the “weighted sum of the arrival time and the energy cost”, and evaluates the paths using a multi-criteria utility function that balances load efficiency, energy consumption, and conflict penalties. The third is equilibrium validation: when the machine selection strategy and the AGV path trajectories remain stable within a threshold, the Nash equilibrium condition is considered to be met.
Algorithm 1 Enhanced Best-Response Dynamic Algorithm
Require: $MA$, $MO$, Jobs, $\omega_1$, $\omega_2$, $\omega_3$, $\mu$, $\nu$, $T_{th}$, $E_l^{max}$
Ensure: $M_{sel}$, $P_{agv}$, $S_{nash}$
Initialization:
for each $M_k \in MA$ do
    Initialize $Q_k$ (queue), $T_k^{idle}$ (idle time) ▹ Section 3.4.1
end for
for each $A_l \in MO$ do
    Initialize position, $E_l$ (energy), current task $= \emptyset$ ▹ Equation (8)
end for
repeat ▹ Dual-stage symmetric game
    Stage 1: Machine Competition
    for each operation $O_{ij} \in$ Jobs do
        Calculate the utility of $M_k$: $U_k = \frac{\omega_1}{D_{ijk}} - \omega_2 |Q_k| + \omega_3 I(T_k^{idle} > T_{th})$ ▹ Equation (10)
        Assign $O_{ij}$ to $M_{k^*}$, where $k^* = \arg\max_k U_k$
        $M_{k^*}.queue \leftarrow M_{k^*}.queue \cup \{O_{ij}\}$ ▹ Update queue
    end for
    Stage 2: AGV Competition
    for each $A_l \in MO$ do
        Generate paths with ST-A* satisfying:
            $\sum_{i,j} y_{ijl}(t) \le 1$ ▹ Single-task exclusivity (Equation (6))
            $\sum_{l} y_{ijl}(t, \ell) \le 1$ ▹ Path conflict-free (Equation (7))
            $\sum_{l} z_{lw}(t) \le 1$ ▹ Warehouse exclusivity (Equation (9))
        Check charging: if $E_l < 0.2 E_l^{max}$, enforce $\sum_{t} y_{ijl}(t) = 0$ ▹ Equation (8)
        Select path maximizing: $U_{path} = \frac{L_{loaded}}{L_{total}} - \mu \frac{E_{consumed}}{E_l^{max}} - \nu \cdot \text{conflicts}$ ▹ Equation (11)
    end for
    Symmetric Update:
    for each $M_k \in MA$ do
        if $M_k.queue = \emptyset$ then
            $T_k^{idle} \leftarrow T_k^{idle} + \Delta t$ ▹ Update idle time
        end if
    end for
    Update AGV selection probabilities via Softmax($U_{path}$) ▹ Nash refinement
until $\max |\Delta U| < \epsilon$ or $iter > 100$ ▹ Convergence condition
Output: $M_{sel}$, $P_{agv}$, $S_{nash}$
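The machine competition stage of Algorithm 1 can be sketched as a plain best-response loop. The Python code below is a simplified illustration and not the full algorithm: it omits the idle-time reward, the AGV stage, and the Softmax refinement, and it assumes a utility of the form ω1/D minus ω2 times the number of other operations currently queued on the machine. Each operation repeatedly re-selects its best machine given the choices of the others, and the loop stops when no operation wants to switch or after 100 iterations. The function name and the toy data are hypothetical.

```python
def best_response_assignment(dist, feasible, w1=1.0, w2=0.5, max_iter=100):
    """dist[o][k]: distance from operation o to machine k; feasible[o]: allowed machines."""
    assign = {o: feasible[o][0] for o in feasible}  # arbitrary starting profile
    for iteration in range(1, max_iter + 1):
        changed = False
        for o in feasible:
            # Queue length of each machine, counting every operation except o itself.
            load = {}
            for other, k in assign.items():
                if other != o:
                    load[k] = load.get(k, 0) + 1

            def utility(k):
                return w1 / dist[o][k] - w2 * load.get(k, 0)

            best = max(feasible[o], key=utility)
            if utility(best) > utility(assign[o]) + 1e-12:  # strictly better response
                assign[o] = best
                changed = True
        if not changed:  # no operation can improve: pure-strategy equilibrium
            return assign, iteration, True
    return assign, max_iter, False


# Toy instance (illustrative values only).
dist = {
    "O11": {"M1": 1.0, "M2": 2.0},
    "O21": {"M1": 1.0, "M2": 2.0},
    "O31": {"M1": 2.0, "M2": 1.0},
}
feasible = {o: ["M1", "M2"] for o in dist}

assign, iters, converged = best_response_assignment(dist, feasible)
print(assign, iters, converged)
```

In this toy instance the loop settles after three sweeps, with the two operations nearest to `M1` sharing that machine and the third operation using `M2`.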

3.4.2. Layer of Genetic Optimization

The genetic optimization layer is deeply integrated with the game process: the Nash equilibrium solutions generated by the multi-agent game are used as the core of the initial population and are encoded together with the multi-agent payoff functions. A hybrid coding method transforms the scheduling scheme into a four-dimensional gene sequence comprising the task allocation, equipment selection, transportation path, and charging decision genes.
The strategic stability and payoff of each agent during the game, as well as the individual fitness value, are considered in the selection operation. The optimization goal of the multi-agent collaborative scheduling problem of the manufacturing workshop is given by the following formulas:
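$$\text{Fitness} = \omega_1 \cdot \tilde{f}_1 + \omega_2 \cdot \tilde{f}_2 + \omega_3 \cdot \tilde{f}_3 \tag{12}$$
$$\tilde{f}_i = \frac{f_i^{max} - f_i}{f_i^{max} - f_i^{min}}, \quad i \in \{1, 2, 3\} \tag{13}$$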
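The normalization maps each objective to the interval [0, 1], with 1 assigned to the best (smallest) value, so that the weighted sum is a fitness to be maximized. A minimal Python sketch is given below; it assumes that the maximum and minimum of each objective are taken over the current population, and the weights and objective values are hypothetical.

```python
def normalize(values):
    """Map raw objective values to [0, 1], where 1 is the best (smallest) value."""
    lo, hi = min(values), max(values)
    if hi == lo:  # all individuals are equal on this objective
        return [1.0 for _ in values]
    return [(hi - v) / (hi - lo) for v in values]


def fitness(population, weights=(0.4, 0.3, 0.3)):
    """population: list of (f1, f2, f3) tuples; returns one fitness value per individual."""
    columns = [normalize(col) for col in zip(*population)]
    return [
        sum(w * col[i] for w, col in zip(weights, columns))
        for i in range(len(population))
    ]


# Toy population: (makespan, empty running rate, total energy), illustrative values only.
population = [(100.0, 0.30, 50.0), (80.0, 0.20, 60.0), (120.0, 0.40, 40.0)]
scores = fitness(population)
print(scores)
```

Here the second individual obtains the highest fitness (about 0.7), because it is best on the two objectives that together carry 70% of the weight.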
In the traditional genetic algorithm, the parameters of the genetic operators (selection, crossover, mutation) are fixed, so in a complex and changeable production environment the algorithm may fail to search for the optimal solution effectively. The game-driven adaptive genetic operator proposed in this paper dynamically adjusts the parameters of the genetic operators according to the competition and cooperation between agents in the game process. At the same time, an elite retention strategy based on game equilibrium is adopted, which gives a higher selection probability to the scheduling schemes generated by agents that have repeatedly obtained high payoffs and stable strategies in the game.
For highly competitive game areas ($\mathrm{MCF}_m > 0.6$, $\mathrm{NCF}_v > 0.5$), gene crossover and mutation are restricted. Crossover of selection gene segments and recombination of path genes are prohibited for high-conflict equipment; only charging genes and non-competitive genes are allowed to cross, so that the locally highly competitive areas are preserved. In the mutation operation, directional replacement is used: the gene is replaced with the second-best equipment or the least-conflicting path from the game decision-making layer.
Because the gene combinations of regions with fierce resource competition may already be close to a local optimum, these restrictions prevent good schemes from being eliminated, while directional replacement helps to avoid falling into the local optimum, quickly alleviates conflicts, and improves the diversity of solutions.
The competition factor for the equipment is as follows:
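$$\mathrm{MCF}_m = \frac{\mathrm{Count}_{bid}(m)}{\sum_{k=1}^{M} \mathrm{Count}_{bid}(k)} \tag{14}$$
The competition factor for the path nodes is as follows:
$$\mathrm{NCF}_v = \frac{\mathrm{ConflictCount}(v)}{\max_{u \in V} \mathrm{ConflictCount}(u)} \tag{15}$$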
In the low-competition game area ($\mathrm{MCF}_m < 0.3$, $\mathrm{NCF}_v < 0.2$), genes are exchanged freely: all gene loci can be crossed to obtain a complete mixture of potentially favorable genes. In the mutation operation, candidate resources with a low historical use frequency (unused machines of the same type, adjacent path nodes not selected by the AGV) randomly replace the original genes with high probability, which improves the global exploration ability in the low-competition area, increases the diversity of the population, and helps the population break out of its current low-diversity state.
$$p_{cross} = \begin{cases} p_{base\_cross} \cdot (1 - \mathrm{MCF}_m), & \text{highly competitive area} \\ p_{base\_cross} \cdot (1 + \mathrm{MCF}_m), & \text{low-competition area} \\ p_{base\_cross}, & \text{otherwise} \end{cases} \tag{16}$$
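The competition factors and the resulting crossover probability can be computed directly from bid and conflict counts. The following Python sketch uses the thresholds 0.6 and 0.3 stated in the text; the base probability of 0.8, the clamping of the result to 1, and the function names are assumptions added here for illustration.

```python
def machine_competition_factors(bid_counts):
    """Share of all bids received by each machine."""
    total = sum(bid_counts.values())
    return {m: (c / total if total else 0.0) for m, c in bid_counts.items()}


def node_competition_factors(conflict_counts):
    """Conflict count of each node relative to the most contested node."""
    peak = max(conflict_counts.values(), default=0)
    return {v: (c / peak if peak else 0.0) for v, c in conflict_counts.items()}


def crossover_probability(mcf, p_base=0.8, high=0.6, low=0.3):
    """Lower the crossover rate in contested zones and raise it in quiet ones."""
    if mcf > high:
        p = p_base * (1.0 - mcf)
    elif mcf < low:
        p = p_base * (1.0 + mcf)
    else:
        p = p_base
    return min(p, 1.0)  # clamp: an assumption, so that the value stays a probability


# Toy counts (illustrative values only).
mcf = machine_competition_factors({"M1": 7, "M2": 2, "M3": 1})
ncf = node_competition_factors({"n1": 4, "n2": 1, "n3": 0})
p_cross = {m: crossover_probability(v) for m, v in mcf.items()}
print(mcf, ncf, p_cross)
```

In this example, the heavily contested machine `M1` has a competition factor of 0.7 and its crossover probability drops to about 0.24, while the lightly used machines `M2` and `M3` are crossed with probabilities of about 0.96 and 0.88.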
In the selection operation of the genetic algorithm, the traditional method relies mainly on the individual fitness value, which makes it easy to overlook scheduling schemes that do not have the highest fitness value but are valuable in actual production. This paper introduces an elite retention strategy based on game equilibrium, which retains not only the individuals with high fitness but also the scheduling schemes that reach the Nash equilibrium state in the game process and contribute greatly to the overall optimization goal. Three kinds of elites are retained. The first is the fitness elite: the top 10% of individuals are retained to maintain the overall advantage. The second is the single-objective contribution elite: the contribution value of each objective is calculated, and the top 5% of individuals are retained to maintain the diversity of single-objective optimization. The third is the equilibrium elite: the high-quality solutions among the Nash equilibrium solutions are retained to preserve the characteristics of the game equilibrium. The selected elite individuals are injected into the genetic operation process, which retains high-quality genes, prevents the loss of excellent solutions, and maintains the diversity of the population, providing a high-quality evolutionary basis for subsequent genetic optimization. Algorithm 2 shows the integrated competition-driven genetic optimization.
Algorithm 2 Integrated Competition-Driven Genetic Optimization
Require: $P_t$: current population; $S_{\text{nash}}$: Nash equilibrium solutions; $\text{Count}_{\text{bid}}(m)$: machine bid counts; $\text{ConflictCount}(v)$: path conflict counts
Ensure: $P_{t+1}$: optimized population
Phase 1: Adaptive Genetic Operations
for each parent pair $(p_1, p_2)$ do
    Initialize offspring $c_1, c_2$ ▹ Competition-aware crossover
    for each gene do
        if Machine selection gene then
            Compute $\text{MCF}_m \leftarrow \dfrac{\text{Count}_{\text{bid}}(m)}{\sum_k \text{Count}_{\text{bid}}(k)}$ ▹ Equation (14)
            Adjust $p_{\text{cross}}$ based on $\text{MCF}_m$
            if $\text{rand}() < p_{\text{cross}} \wedge \text{MCF}_m > 0.6$ then
                Apply restricted crossover ▹ High-competition zone
            end if
        else if Path gene then
            Compute $\text{NCF}_v \leftarrow \dfrac{\text{ConflictCount}(v)}{\max_u \text{ConflictCount}(u)}$ ▹ Equation (15)
            Adjust $p_{\text{cross}}$ based on $\text{NCF}_v$
            if $\text{rand}() < p_{\text{cross}}$ then
                Verify energy: if $E_l < 0.2\,E_l^{\max}$, enforce charging ▹ Equation (8)
                Recombine paths with ST-A* verification ▹ Equation (9)
            end if
        end if
    end for
    ▹ Strategy-guided mutation
    for each gene do
        if Machine gene $\wedge\ \text{MCF}_m > 0.6$ then
            Replace with low-competition machine
        else if Machine gene $\wedge\ \text{MCF}_m < 0.3$ then
            Explore underutilized machines
        else if Path gene $\wedge\ \text{NCF}_v > 0.5$ then
            Reroute using conflict-aware planning
        else if Path gene $\wedge\ \text{NCF}_v < 0.2$ then
            Explore adjacent nodes
        end if
    end for
    Add $\{c_1, c_2\}$ to $O_t$
end for
Phase 2: Elite Selection
Normalize objectives: $\hat{f}_k = \dfrac{f_k^{\max} - f_k}{f_k^{\max} - f_k^{\min}}$
Select:
    Top 10% by $\sum_k \omega_k \hat{f}_k$ ▹ Global fitness
    Top 5% per objective ▹ Single-objective excellence
    Nash solutions with $\text{Stability} > 0.8 \wedge \text{Contribution} > 0.7$ ▹ Section 3.4.2 constraint
Aggregate elite pool $E_t$
Phase 3: Dynamic Adaptation
$P_{t+1} \leftarrow O_t \cup E_t$
Update $H_m \leftarrow 0.9\,H_m + 0.1 \cdot \text{MCF}_m$ ▹ Equation (14)
if $t \bmod 5 = 0$ then
    Refresh competition metrics
    Adjust $\omega_k$ via gradient ascent on $E_{\text{nash}}$ ▹ Section 3.4.2 mechanism
end if
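The elite selection in Phase 2 of Algorithm 2 can be illustrated with a minimal Python sketch. It assumes that all objectives are minimized, that the population is given as an array of raw objective values, and that the stability and contribution scores of Nash solutions are already available as plain numbers. The function name `select_elites`, its parameters, and the rounding of the 10% and 5% quotas are illustrative assumptions, not part of the original method.

```python
import numpy as np

def select_elites(objectives, weights, nash_idx, stability, contribution,
                  top_global=0.10, top_single=0.05):
    """Return sorted indices of an elite pool built as in Phase 2.

    objectives   : (n, K) array of raw objective values, all minimized (assumption)
    weights      : (K,) objective weights omega_k
    nash_idx     : indices of individuals that are Nash equilibrium solutions
    stability, contribution : per-individual scores (hypothetical stand-ins
                   for the Section 3.4.2 measures)
    """
    f = np.asarray(objectives, dtype=float)
    n, num_obj = f.shape
    f_max, f_min = f.max(axis=0), f.min(axis=0)
    # Guard against a zero range when every individual ties on an objective.
    span = np.where(f_max > f_min, f_max - f_min, 1.0)
    f_hat = (f_max - f) / span  # normalized: 1 is best, 0 is worst

    elite = set()
    # Fitness elite: top share by weighted sum of normalized objectives.
    n_global = max(1, round(top_global * n))
    score = f_hat @ np.asarray(weights, dtype=float)
    elite.update(np.argsort(-score)[:n_global].tolist())
    # Single-objective elite: top share for each objective separately.
    n_single = max(1, round(top_single * n))
    for k in range(num_obj):
        elite.update(np.argsort(-f_hat[:, k])[:n_single].tolist())
    # Nash elite: equilibrium solutions passing both thresholds.
    for i in nash_idx:
        if stability[i] > 0.8 and contribution[i] > 0.7:
            elite.add(int(i))
    return sorted(elite)
```

Collecting the indices in a set means an individual that qualifies under several criteria enters the elite pool only once.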

3.4.3. Coordinated Pivot Level

The coordination pivot layer collects real-time feedback from each agent and passes it to the game decision-making layer and the genetic optimization layer. For the game decision-making layer, this real-time information enables agents to adjust their strategies to the latest production environment. If a device fails, the machine agent reports the failure to the coordination pivot layer, which forwards it to the game decision-making layer; in the subsequent task allocation game, the other machine agents take the failure into account and re-select their strategies. For the genetic optimization layer, the information is used to update the constraints. If the battery level of a mobile agent is low, genetic operations on the gene fragments related to that agent's transport tasks and charging plan must give priority to satisfying the charging trigger constraint. Algorithm 3 shows the specific procedure of the coordination pivot layer.
Algorithm 3 Real-Time Coordination Mechanism
Require: $D_{\text{real}}$: real-time monitoring data (equipment status, energy, paths); $P_t$: current population of genetic optimization layer; $S_{\text{nash}}$: Nash equilibrium solutions from game layer; $\text{MCF}_m$: machine competition factors (Equation (13)); $\text{NCF}_v$: path conflict factors (Equation (14))
Ensure: $C_{\text{updated}}$: updated constraints for scheduling; $F_{\text{valid}}$: feasibility flag (True if all constraints are satisfied)
Step 1: Update energy constraints (Equation (8))
for all $A_l \in A$ do ▹ AGV set $A$ as defined in Chapter 3
    if $A_l.\text{battery} < 0.2 \cdot E_l^{\max}$ then ▹ Battery threshold $\theta_E = 20\%$
        $C_{\text{updated}} \leftarrow C_{\text{updated}} \cup \{\text{ChargeTask}(A_l)\}$ ▹ Add charging task to constraints
    end if
end for
Step 2: Update path conflict constraints (Equation (9))
Calculate path node occupancy:
    $\sum_{A_l \in A} y_{ijl}(t, l) \le 1, \quad \forall t,\ \forall l \in \text{PathNodes}$ ▹ Ensure no concurrent occupancy of path nodes
Step 3: Update competition factors (Equations (13) and (14))
Machine competition factors:
    $\text{MCF}_m^{t+1} \leftarrow 0.7 \cdot \text{MCF}_m^{t} + 0.3 \cdot \dfrac{\text{BidCount}(M_k)}{\max_{M_k \in M} \text{BidCount}(M_k)}$ ▹ Exponential smoothing for dynamic adaptation
Path conflict factors:
    $\text{NCF}_v^{t+1} \leftarrow \dfrac{\text{ConflictCount}(l)}{\max_{l \in \text{PathNodes}} \text{ConflictCount}(l)}$ ▹ Normalize by maximum conflict count
Step 4: Verify feasibility against core constraints
Check operation precedence constraint (Equation (4)):
    $C_{ij} \le S_{i(j+1)}, \quad \forall i, j$
Verify machine exclusivity constraint (Equation (5)):
    $\sum_{i=1}^{n} \sum_{j=1}^{h_i} x_{ijk}(t) \le 1, \quad \forall M_k \in M,\ \forall t$
Validate transportation-charging exclusivity (Equation (8)):
    $\sum_{t'=t}^{t + t_{cm} - 1} y_{ijl}(t') = 0$ when charging is triggered
Step 5: Set feasibility flag
$F_{\text{valid}} \leftarrow \text{True}$ if all constraints in $C_{\text{updated}}$ are satisfied, else False
return $C_{\text{updated}}, F_{\text{valid}}$
At the same time, the coordination pivot layer also checks the rationality of the initial scheduling scheme generated by the game decision-making layer and verifies the feasibility of the final scheduling scheme output by the genetic optimization layer against the actual production environment constraints, such as the material transportation, charging trigger, warehouse exclusivity, and conflict-free path constraints.
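Steps 1 and 3 of Algorithm 3 can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions: the `AGV` record, the function names, and the dictionary-based inputs are hypothetical, and the ratio in the machine factor update is taken to use the bid count of the machine being updated.

```python
from dataclasses import dataclass

@dataclass
class AGV:
    name: str
    battery: float      # current energy E_l
    battery_max: float  # capacity E_l^max

def charging_tasks(agvs, threshold=0.2):
    """Step 1: names of AGVs whose battery is below threshold * capacity."""
    return [a.name for a in agvs if a.battery < threshold * a.battery_max]

def update_mcf(mcf_prev, bid_counts, alpha=0.3):
    """Step 3: exponentially smoothed machine competition factors.

    New value = (1 - alpha) * old value + alpha * (bids / max bids),
    with alpha = 0.3 matching the 0.7 / 0.3 split in Algorithm 3.
    """
    peak = max(bid_counts.values(), default=0)
    updated = {}
    for machine, count in bid_counts.items():
        ratio = count / peak if peak > 0 else 0.0
        updated[machine] = (1 - alpha) * mcf_prev.get(machine, 0.0) + alpha * ratio
    return updated

def update_ncf(conflict_counts):
    """Step 3: path conflict factors normalized by the maximum conflict count."""
    peak = max(conflict_counts.values(), default=0)
    return {node: (c / peak if peak > 0 else 0.0)
            for node, c in conflict_counts.items()}
```

With a 20% threshold, an AGV at exactly 20% of capacity does not trigger a charging task, because the comparison in Algorithm 3 is strict.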

4. Simulation Evaluation

A virtual multi-agent system is developed in Python (version 3.13.0) following the concept of multi-agent modeling. The experiments are run on a 12th-generation Intel(R) Core(TM) i7-12800HX CPU with 32 GB of random-access memory.

4.1. Simulation Setup

To verify the effectiveness of the hybrid game–genetic algorithm in multi-agent job shop scheduling, it is compared with the traditional genetic algorithm and a particle swarm optimization algorithm. The traditional genetic algorithm comprises six core steps: population initialization, fitness evaluation, selection, crossover, mutation, and population update. The initial population is composed of randomly generated individuals. In the fitness evaluation stage, each individual's genes are decoded into a scheduling scheme, from which the maximum task completion time, AGV idle rate, and energy consumption are calculated; the fitness value is their weighted sum. The selection operation uses a roulette mechanism that selects parent individuals in proportion to fitness. The crossover operation performs a two-point crossover with a probability of 0.8, and the mutation operation randomly perturbs gene values with a probability of 0.2. The algorithm updates the population by replacing the parents completely with the offspring, without an elite reservation mechanism. This algorithm has three limitations: its search ability is restricted by fixed parameters; the randomness of the crossover and mutation operations makes population diversity decay rapidly, so the search easily falls into local optima; and the greedy strategy used in decoding may lead to suboptimal resource allocation.
The particle swarm optimization algorithm used for comparison is a variant of traditional PSO with several improvements. The algorithm comprises six stages: particle initialization, fitness evaluation, velocity update, position update, individual and global best update, and particle reset. The particle encoding is consistent with the gene encoding of the genetic algorithm, but dimension checking is added during initialization to ensure the validity of the parameters. The velocity update formula uses a dynamic inertia weight (decaying linearly from 0.9 to 0.4) together with a cognitive factor (1.5) and a social factor (1.5) to balance global and local search. After the position is updated, the validity of the encoding is ensured by rounding and boundary constraints. A particle reset mechanism is also introduced: the particle with the worst fitness is replaced by a new randomly generated solution every 50 runs to maintain population diversity.
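The PSO velocity and position update described above can be sketched as follows. This is a minimal illustration that assumes the standard PSO velocity rule and an integer encoding with per-dimension bounds. The function names and argument layout are hypothetical, and the particle reset step is omitted.

```python
import random

def inertia_weight(iteration, max_iter, w_start=0.9, w_end=0.4):
    """Linearly decay the inertia weight from w_start to w_end."""
    frac = iteration / max(1, max_iter - 1)
    return w_start - (w_start - w_end) * min(1.0, frac)

def pso_step(position, velocity, personal_best, global_best, w,
             lower, upper, c1=1.5, c2=1.5, rng=random):
    """One velocity and position update, then rounding and boundary clipping.

    c1 is the cognitive factor and c2 the social factor (both 1.5 in the text).
    """
    new_pos, new_vel = [], []
    for x, v, pb, gb, lo, hi in zip(position, velocity, personal_best,
                                    global_best, lower, upper):
        r1, r2 = rng.random(), rng.random()
        v_next = w * v + c1 * r1 * (pb - x) + c2 * r2 * (gb - x)
        x_next = int(round(x + v_next))           # keep the encoding integral
        new_vel.append(v_next)
        new_pos.append(min(hi, max(lo, x_next)))  # enforce boundary constraints
    return new_pos, new_vel
```

A large early inertia weight favors exploration, while the smaller late value lets the cognitive and social terms dominate and pull particles toward the best known solutions.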
For both simple and complex workshops, simulation experiments with different numbers of workpieces are carried out using the three algorithms. In the simple workshop, experiments are run with 20 and 100 workpieces; the other parameters are held constant: 5 AGVs, 8 pieces of equipment, 1 charging pile, and a 20 m × 20 m workshop site represented as a grid map. In the complex workshop, experiments are run with 40 and 100 workpieces; the other parameters are again held constant: 10 AGVs, 20 pieces of equipment, 3 charging piles, and a 40 m × 40 m workshop site.
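For readers reproducing the setup, the four experimental configurations can be written down as plain data. The following is a sketch only: the `Scenario` class, its field names, and the scenario labels are illustrative assumptions, not part of the original implementation.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class Scenario:
    name: str            # hypothetical label
    workpieces: int
    agvs: int
    machines: int        # "pieces of equipment" in the text
    charging_piles: int
    site_side_m: int     # side length of the square workshop site, in metres

SCENARIOS = [
    Scenario("simple-20", 20, 5, 8, 1, 20),
    Scenario("simple-100", 100, 5, 8, 1, 20),
    Scenario("complex-40", 40, 10, 20, 3, 40),
    Scenario("complex-100", 100, 10, 20, 3, 40),
]
```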

4.2. Comparative Analysis of Simulation Results

4.2.1. Algorithm Analysis in Simple Scenarios

To test the reliability of the algorithm, four performance indexes (task completion time, total energy consumption, AGV idling rate, and fitness value) are compared for the three algorithms (the hybrid game–genetic algorithm, traditional genetic algorithm, and particle swarm optimization algorithm) in the simple workshop scenario. Table 2 shows the comparison between our algorithm and the other algorithms in different situations.
1. Fitness Values:
As shown in Figures 3 and 4, in the simple workshop experiments (5 AGVs, 8 machines, 1 charging pile), the hybrid game–genetic algorithm (HG-GA) demonstrates robust optimization performance across both the 20-workpiece and the 100-workpiece scenarios: for the 20-workpiece case, the fitness value falls from 330 to approximately 260 within the first 50 iterations and then stabilizes at 250; for the 100-workpiece scenario, although the initial fitness rises to 400 due to the increased task complexity, the HG-GA still drives the value below 300 within 80 iterations and maintains stable convergence around 295. Compared with the traditional genetic algorithm (TGA) and particle swarm optimization (PSO), the HG-GA exhibits faster convergence and higher precision. The TGA’s fitness fluctuates erratically between 290 and 340 (20 workpieces) and between 350 and 420 (100 workpieces) owing to fixed crossover/mutation parameters and random operations, resulting in slow convergence and frequent entrapment in local optima. PSO, although showing a rapid initial decline, stagnates at 270 (20 workpieces) and 320 (100 workpieces) after 100 iterations, exposing its inherent deficiency in local search ability.
2. Completion Times:
As shown in Figures 5 and 6, in terms of the completion time index, the hybrid game–genetic algorithm (HG-GA) exhibits excellent convergence performance across different workpiece scales: for smaller workloads, its completion time rapidly drops from 100 to 55 within 50 iterations and stabilizes at 50; for larger workloads, despite the initial increase in completion time due to more tasks, it still quickly decreases to around 300 within 100 iterations and remains stable. In contrast, the traditional genetic algorithm (TGA) converges slowly, with no stable result even in the first 150 generations, and its curve fluctuates significantly due to fixed crossover and mutation strategies.
3. Idle Rates:
As shown in Figures 7 and 8, the AGV idling rate of the hybrid game–genetic algorithm (HG-GA) exhibits strong optimization stability across different workpiece scales. In scenarios with fewer workpieces, it rapidly decreases from 40% to less than 10% within 50 iterations and remains stable thereafter. Even in scenarios with more workpieces, despite a slight initial increase in the idling rate due to increased transportation demand, it still drops to around 13% within 100 iterations and remains low. In contrast, the idling rate of the traditional genetic algorithm (TGA) fluctuates between 20% and 40% in both scenarios, with no obvious downward trend, as its random selection process easily leads to uneven distribution of AGV workloads. The improved particle swarm optimization (PSO) shows an upward trend in the idling rate as iterations proceed: although its initial value is around 30% (lower than the TGA), it fails to effectively avoid path conflicts or balance task assignments in multi-objective optimization, resulting in a gradual increase in empty running.
4. Total Energy Consumption:
As shown in Figures 9 and 10, the total energy consumption of the hybrid game–genetic algorithm (HG-GA) shows strong optimization stability across different workpiece scales: in scenarios with fewer workpieces, its energy consumption rapidly decreases from 1000 units to about 780 units within 50 iterations and remains stable thereafter; in scenarios with more workpieces, despite the initial increase in energy consumption due to increased processing and transportation tasks, it still drops to around 4000 units within 100 iterations and remains low. In contrast, the energy consumption of the traditional genetic algorithm (TGA) fluctuates between 850 and 1000 units (fewer workpieces) and between 4500 and 5500 units (more workpieces) due to fixed crossover/mutation strategies and a lack of targeted energy optimization, with no obvious downward trend. The particle swarm optimization (PSO) algorithm, although starting with lower initial energy consumption (around 800 units for fewer workpieces), fails to optimize further in later iterations, stabilizing at 800–950 units (fewer workpieces) and 4800–5200 units (more workpieces) due to an insufficient local search capability in handling complex energy consumption constraints.

4.2.2. Algorithm Analysis in Complex Scenarios

By analyzing the experimental results for the complex workshops, it can be seen that the hybrid game–genetic algorithm has a better optimization ability in dealing with complex situations. A specific analysis follows.
1. Fitness Values:
As shown in Figures 11 and 12, in complex workshop scenarios, the hybrid game–genetic algorithm (HG-GA) exhibits robust optimization performance across different workpiece scales: for 40-workpiece scenarios, its fitness value rapidly decreases from approximately 1050 to around 550 within 200 iterations and stabilizes at this low value thereafter; for 100-workpiece scenarios, although the initial fitness rises to about 2100 due to increased task complexity, it still drops to around 1500 within 300 iterations and maintains stable convergence. This demonstrates its strong ability to efficiently search for high-quality solutions in complex environments, with superior convergence speed and accuracy compared to other algorithms.
In contrast, the traditional genetic algorithm (TGA) shows significant fluctuations in its fitness curve. In 40-workpiece scenarios, its fitness fluctuates between 750 and 900, and, in 100-workpiece scenarios, it oscillates between 1700 and 2000 with slow convergence. The final fitness value remains higher (around 780 for 40 workpieces and 1900 for 100 workpieces) than that of the HG-GA, indicating its tendency to fall into local optima in complex scenarios and its insufficient global search capability.
The particle swarm optimization (PSO) algorithm declines rapidly in the early stage—its fitness drops from 1000 to around 750 in 40-workpiece scenarios and from 2000 to about 1700 in 100-workpiece scenarios within the first 100 iterations—but plateaus in the middle stage. The final fitness value (around 700 for 40 workpieces and 1700 for 100 workpieces) is still higher than that of the HG-GA, reflecting its low later-stage search efficiency and limited adaptability to complex multi-variable scenarios.
2. Completion Times:
As shown in Figures 13 and 14, in complex workshop scenarios, the hybrid game–genetic algorithm (HG-GA) demonstrates excellent optimization performance in terms of completion time across different workpiece scales: for 40-workpiece scenarios, its completion time rapidly decreases from around 240 units to approximately 60 units within 200 iterations and stabilizes at this low level thereafter; for 100-workpiece scenarios, although the initial completion time rises to about 500 units due to the increased task load, it still drops to around 300 units within 300 iterations and maintains stable convergence. This reflects its strong ability to optimize workshop resource scheduling, effectively shortening the production cycle and improving overall efficiency. In contrast, the traditional genetic algorithm (TGA) shows frequent fluctuations in its completion time curve. In 40-workpiece scenarios, its completion time fluctuates between 120 and 180 units, and, in 100-workpiece scenarios, it oscillates between 400 and 500 units with slow convergence. The final completion time (about 140 units for 40 workpieces and 450 units for 100 workpieces) is significantly higher than that of the HG-GA, indicating that the TGA struggles to balance resources in complex task allocation, leading to prolonged production cycles. The particle swarm optimization (PSO) algorithm, although able to converge, exhibits a limited optimization effect. In 40-workpiece scenarios, its completion time stabilizes at around 140 units, and, in 100-workpiece scenarios, it plateaus at about 450 units, which is close to the TGA’s performance. This indicates that PSO still has limitations in searching for the global optimal solution when scheduling multiple devices and multiple jobs, especially in handling the complex constraints of resource competition and path conflicts in large-scale scenarios.
3. Idle Rates:
As shown in Figures 15 and 16, in complex workshop scenarios, the AGV idle rate of the hybrid game–genetic algorithm (HG-GA) maintains superior optimization performance across different workpiece scales: for 40-workpiece scenarios, it rapidly decreases from around 60% to approximately 13% within 200 iterations and stabilizes at this low level; for 100-workpiece scenarios, despite the increased transportation demand, it still drops to about 13% within 300 iterations and remains stable, demonstrating its ability to schedule equipment resources reasonably and significantly reduce idle time to improve resource utilization. In contrast, the idle rate of the traditional genetic algorithm (TGA) decreases slowly and fluctuates in both scenarios, finally stabilizing at around 40%, reflecting the difficulty it experiences in coordinating equipment and task allocation in complex scenarios, which leads to resource waste. The particle swarm optimization (PSO) algorithm performs even worse, with its idle rate stabilizing at about 58% in both 40-workpiece and 100-workpiece scenarios, significantly higher than that of the HG-GA, indicating its limitations in multi-objective optimization, especially in controlling the AGV idle rate.
4. Total Energy Consumption:
As shown in Figures 17 and 18, in complex workshop scenarios, the total energy consumption of the hybrid game–genetic algorithm (HG-GA) exhibits robust optimization performance across different workpiece scales: for 40-workpiece scenarios, its energy consumption rapidly decreases from approximately 3200 units to around 1800 units within 200 iterations and stabilizes at this low level thereafter; for 100-workpiece scenarios, although the initial energy consumption rises to about 7000 units due to increased processing and transportation tasks, it still drops to around 4500 units within 300 iterations and maintains stable convergence. This indicates that the HG-GA is highly efficient in optimizing AGV path planning and equipment operation schedules, effectively reducing system energy consumption.
In contrast, the energy consumption of the traditional genetic algorithm (TGA) fluctuates significantly in both scenarios: in 40-workpiece scenarios, it oscillates around 2400 units, and, in 100-workpiece scenarios, it fluctuates between 6000 and 6500 units with no obvious downward trend. The final energy consumption (about 2400 units for 40 workpieces and 6500 units for 100 workpieces) is significantly higher than that of the HG-GA, reflecting its lack of comprehensive coordination ability in integrating factors such as AGV movement paths and equipment start–stop in complex energy consumption optimization scenarios. The particle swarm optimization (PSO) algorithm, although converging faster than the TGA, still shows higher energy consumption: in 40-workpiece scenarios, it stabilizes at around 2200 units after convergence, and, in 100-workpiece scenarios, it plateaus at about 5500 units. This is higher than the HG-GA in both cases, reflecting its limitations in multi-objective optimization, particularly in refining energy-saving strategies for complex resource interactions due to insufficient local search capability.

4.2.3. Component Contribution Analysis Through Ablation Studies

1. Fitness Comparison Performance and Cause Analysis:
As shown in Figures 19 and 20, in both the 100-workpiece and 40-workpiece scenarios, the hybrid game–GA has the lowest fitness value and the most stable convergence: it stabilizes at around 1500 for 100 workpieces and around 550 for 40 workpieces. The fitness value of the Only-Game Layer is the highest, with a final value of over 2400 for 100 workpieces and approximately 700–800 for 40 workpieces. The fitness value of the No-Gate GA (genetic layer only) lies between the two: approximately 2200–2400 for 100 workpieces and 600–700 for 40 workpieces.
The reason is that, in the hybrid game–GA, the game layer provides high-quality initial elite solutions for genetic optimization through the Nash equilibrium, reducing the search burden on the genetic layer. However, the Only-Game Layer lacks the global search capability of the genetic layer and relies only on local negotiation, making it difficult to escape suboptimal solutions. The No-Gate GA has low efficiency in exploring high-quality solutions due to the random generation of initial solutions. Although there is global optimization, the quality of the starting point is insufficient.
2. Completion Time Performance and Cause Analysis:
As shown in Figures 21 and 22, in terms of maximum completion time, the hybrid game–GA performs the best: it stabilizes at around 300 for 100 workpieces and around 60 for 40 workpieces. The completion time of the Only-Game Layer is the longest, fluctuating between 500 and 600 for 100 workpieces and around 150–200 for 40 workpieces. The completion time of the No-Gate GA is in the middle, with 400–500 for 100 workpieces and 100–150 for 40 workpieces.
The reason is that, in the hybrid game–GA, the game layer dynamically coordinates the task allocation between machines and AGVs, reducing spatiotemporal conflicts. The genetic layer optimizes the global task ranking and shortens the overall cycle through adaptive operators. The lack of global coordination in the local decisions of the Only-Game Layer can easily lead to poor task connection. Although the No-Gate GA has global optimization, it lacks real-time conflict adjustment at the game layer, and fixed strategies find it difficult to cope with dynamic task changes.
3. Performance and Cause Analysis of AGV Idle Rate Comparison:
As shown in Figures 23 and 24, the hybrid game–GA has the lowest AGV idle rate: it remains stable at around 13% for both 100 and 40 workpieces. The Only-Game Layer has the highest idle rate: about 70% for 100 workpieces and 60% for 40 workpieces. The idle rate of the No-Gate GA is about 40%, with relatively small fluctuations in both scenarios.
The reason is that the game layer of the hybrid game–GA dynamically balances the AGV load through the AGV utility function (prioritizing high-load-rate tasks and punishing conflict paths). The genetic layer further optimizes path allocation and reduces empty runs. The Only-Game Layer relies solely on local negotiation and lacks a global load-balancing mechanism, resulting in uneven busy and idle AGVs. The fixed genetic strategy of the No-Gate GA finds it difficult to adjust AGV task allocation in real time and cannot effectively reduce idle time.
4. Performance and Cause Analysis of Total Energy Consumption Comparison:
As shown in Figures 25 and 26, the hybrid game–GA has the lowest total energy consumption: it stabilizes at around 4500 for 100 workpieces and around 1800 for 40 workpieces. The Only-Game Layer has the highest energy consumption: approximately 7000 for 100 workpieces and 3200 for 40 workpieces. The energy consumption of the No-Gate GA is in the middle: about 6500 for 100 workpieces and about 2400 for 40 workpieces.
The reason is that, in the hybrid game–GA, the game layer optimizes AGV transportation energy consumption through path conflict penalties and energy consumption terms. The genetic layer integrates the energy consumption models of machine processing and AGV transportation, reducing equipment start–stop and idle energy consumption through adaptive operators. The Only-Game Layer local decision-making does not globally optimize energy consumption factors, resulting in high energy consumption path selection. The No-Gate GA lacks energy-sensitive initial solutions in the game layer, making it difficult to accurately reduce energy consumption in key links through global optimization.
In theory, the difference in the ablation experimental results is due to the architecture’s ability to handle local and global optimization: only the game layer is prone to falling into suboptimal solutions due to the lack of global search, and only the genetic layer is prone to insufficient optimization due to the random initial solution and lack of dynamic coordination. The hybrid framework provides high-quality initial elite solutions through the game layer, global optimization through the genetic layer, and real-time feedback through the coordination layer, achieving a balance between local conflict resolution and global resource coordination. Its core advantage lies in the integration of local dynamic negotiation and global optimization advantages, which not only improves the efficiency of the optimization starting point, but also ensures the accuracy and dynamic adaptability of the global solution.

5. Conclusions and Discussion

This paper proposes a multi-agent collaborative scheduling technology for manufacturing workshops based on a hybrid game–genetic framework, which addresses critical issues in modern manufacturing such as AGV energy redundancy, delayed response to multi-variety orders, and high idle rates. To achieve comprehensive coordination of equipment scheduling, logistics transportation, and energy management, the framework integrates the global search advantages of genetic algorithms with the local optimization capabilities of game theory through a trinity architecture of distributed decision-making, dynamic coordination, and environmental awareness. This section synthesizes key findings, discusses theoretical and practical implications, analyzes limitations, and outlines future research directions.

5.1. Core Contributions and Performance Validation

The experimental results demonstrate that the proposed hybrid game–genetic algorithm (HG-GA) achieves significant advantages in multi-agent job shop scheduling across both simple and complex scenarios, validating its effectiveness and scalability. In simple workshop scenarios (20 workpieces, 5 AGVs, 8 machines, 1 charging pile), the HG-GA outperforms traditional genetic algorithms (TGA) and particle swarm optimization (PSO) in all key metrics:
  • The fitness value converges rapidly from an initial 330 to a stable 250, representing a 20.6% improvement over the TGA (315) and a 7.4% improvement over PSO (270).
  • The maximum completion time is shortened to 50 units, a 54.5% reduction compared to the TGA (110 units) and a 44.4% reduction compared to PSO (90 units), highlighting its ability to streamline production cycles.
  • The AGV idle rate is reduced from 40% to 9.5%, a 68.3% reduction relative to both the TGA and PSO (30% each), effectively mitigating resource waste.
  • Total energy consumption drops from 1000 units to 780 units, with 15.7% lower consumption than the TGA (925 units) and 10.9% lower consumption than PSO (875 units), demonstrating its energy-saving potential.
In complex workshop scenarios (40 workpieces, 10 AGVs, 20 machines, 3 charging piles), the advantages of the HG-GA are even more pronounced:
  • The fitness value stabilizes at 550, a 29.5% improvement over the TGA (780) and a 21.4% improvement over PSO (700), reflecting stronger adaptability to high-complexity tasks.
  • The maximum completion time is reduced to 60 units, a 57.1% reduction compared to both the TGA and PSO (140 units), underscoring its efficiency in coordinating multi-resource interactions.
  • The AGV idle rate is optimized to 13%, a 67.5% reduction compared to the TGA (40%) and a 77.6% reduction compared to PSO (58%), indicating superior dynamic load balancing.
  • Total energy consumption is lowered to 1800 units, 25% less than the TGA (2400 units) and 18.2% less than PSO (2200 units), validating its effectiveness in integrating processing and transportation energy optimization.
Even with increased task scales (100 workpieces in both scenarios), the HG-GA maintains stable performance: completion time, idle rate, and energy consumption remain significantly lower than those of baseline algorithms, confirming its scalability to large-scale manufacturing environments.
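The percentage improvements listed above follow from the relative reduction (baseline − proposed) / baseline. A short Python check, using only the values reported in this section, reproduces them. The function name `reduction_pct` and the table layout are illustrative.

```python
def reduction_pct(baseline, proposed):
    """Relative reduction of `proposed` versus `baseline`, in percent (1 decimal)."""
    return round(100.0 * (baseline - proposed) / baseline, 1)

# metric: (HG-GA, TGA, PSO), values as reported in Section 5.1
SIMPLE_20 = {
    "fitness": (250, 315, 270),
    "completion_time": (50, 110, 90),
    "idle_rate": (9.5, 30, 30),
    "energy": (780, 925, 875),
}
COMPLEX_40 = {
    "fitness": (550, 780, 700),
    "completion_time": (60, 140, 140),
    "idle_rate": (13, 40, 58),
    "energy": (1800, 2400, 2200),
}

def improvements(table):
    """Map each metric to its (vs. TGA, vs. PSO) percentage reductions."""
    return {metric: (reduction_pct(tga, hg), reduction_pct(pso, hg))
            for metric, (hg, tga, pso) in table.items()}
```

For example, `improvements(SIMPLE_20)["completion_time"]` evaluates to `(54.5, 44.4)`, matching the reductions reported for the simple scenario.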

5.2. Theoretical Innovations and Mechanism Analysis

The superior performance of the HG-GA stems from three core theoretical innovations which address long-standing bottlenecks in existing scheduling methods.

5.2.1. Hybrid Game–Genetic Synergy

The integration of game theory and genetic algorithms creates a “local negotiation–global optimization” closed-loop mechanism. The game decision-making layer generates high-quality initial solutions through the Nash equilibrium, reducing the search burden for the genetic layer. For example, machine agents compete based on load and distance utility functions, while AGV agents optimize paths using load efficiency, energy consumption, and conflict penalty terms. This ensures that initial task allocations and path plans are already conflict-mitigated and locally optimal. The genetic optimization layer then refines these solutions using adaptive operators (dynamic crossover/mutation probabilities based on competition factors) and elite retention strategies (incorporating Nash equilibrium solutions), avoiding local optima and enhancing global search precision. This synergy overcomes the limitations of game theory alone (insufficient global optimization) and genetic algorithms alone (random initial solutions and poor dynamic adaptability).

5.2.2. Symmetric Multi-Agent Framework

A key theoretical breakthrough is the introduction of symmetry in agent interactions. All agents (machine, AGV, warehouse, charging) adhere to unified optimization criteria and interaction rules: machine agents compete using a symmetric utility function balancing load and distance; AGV agents plan paths with symmetric cost models integrating energy and conflict avoidance; and the environment awareness layer provides symmetric real-time data feedback to all agents. This symmetry eliminates arbitrary biases in resource allocation, fosters fair competition, and enables dynamic equilibrium between cooperation and competition. For instance, symmetric information sharing prevents information asymmetry-induced conflicts, while unified utility functions ensure consistent decision logic across heterogeneous agents, simplifying system complexity and enhancing robustness.

5.2.3. Real-Time Coordination Mechanism

The coordination pivot layer forms a closed-loop optimization system by dynamically updating constraints based on environmental sensing (e.g., equipment status, AGV battery levels, path conflicts). This layer verifies the feasibility of game-generated initial schemes and genetic-optimized final schemes against core constraints (operation precedence, machine exclusivity, charging triggers), ensuring practical applicability. For example, when an AGV’s battery drops below 20%, the coordination layer immediately updates charging constraints, prompting the genetic layer to prioritize charging-related gene segments during optimization. This real-time feedback mechanism enables the system to adapt to dynamic disruptions (e.g., equipment failures, urgent order insertions), addressing the rigidity of traditional fixed scheduling rules.

5.3. Practical Implications and Industry Relevance

The proposed method holds significant practical value for advancing intelligent and low-carbon transformation in manufacturing.

5.3.1. Efficiency and Resource Utilization

By reducing the maximum completion time by up to 57.1% and the AGV idle rate by up to 77.6%, the HG-GA directly enhances production efficiency and resource utilization. In complex multi-process scenarios (e.g., spinning–weaving–dyeing), the method enables seamless collaboration between machines, AGVs, warehouses, and charging piles, mitigating spatiotemporal conflicts between processing and transportation—a critical pain point in traditional discrete manufacturing.
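The two headline percentages can be reproduced from the Table 2 values for the complex scenario with 40 workpieces, using the improvement formula given in the table note, (Baseline − HG-GA) / Baseline × 100%. The function name below is an illustrative choice.

```python
def improvement(baseline, proposed):
    """Relative improvement in percent: (baseline - proposed) / baseline * 100."""
    return (baseline - proposed) / baseline * 100.0


# Complex scenario, 40 workpieces (values from Table 2).
makespan_vs_tga = improvement(140.0, 60.0)   # makespan in minutes, TGA vs. HG-GA
idle_vs_pso = improvement(58.0, 13.0)        # AGV idle rate in %, PSO vs. HG-GA

print(round(makespan_vs_tga, 1))  # 57.1
print(round(idle_vs_pso, 1))      # 77.6
```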

5.3.2. Energy Conservation and Low-Carbon Development

With total energy consumption reduced by up to 30.8% in complex scenarios, the method aligns with global low-carbon manufacturing goals. Its refined energy modeling (differentiating machine processing energy and AGV idle/loaded energy) and targeted optimization (e.g., penalizing high-energy paths in AGV utility functions) provide a technical path for reducing carbon emissions in workshops.
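A rough Python sketch of an energy model of this kind follows: machine energy scales with processing time, AGV energy with loaded and empty travel distance, and a path utility penalizes energy-hungry or conflict-prone routes. All coefficients, weights, and function names are placeholder assumptions, not parameters reported in the paper.

```python
def total_energy_kwh(machine_hours, loaded_km, empty_km,
                     machine_kw=5.0, loaded_kwh_per_km=0.8,
                     empty_kwh_per_km=0.5):
    """Total energy as machine processing energy plus AGV travel energy.

    Loaded and empty AGV travel use different per-kilometre rates.
    The default coefficients are illustrative placeholders.
    """
    machine_energy = machine_kw * machine_hours
    agv_energy = loaded_kwh_per_km * loaded_km + empty_kwh_per_km * empty_km
    return machine_energy + agv_energy


def path_utility(path_energy_kwh, conflicts, w_energy=1.0, w_conflict=2.0):
    """Higher is better: paths are penalized for energy use and conflicts."""
    return -(w_energy * path_energy_kwh + w_conflict * conflicts)


# Example: 10 machine-hours, 4 km loaded travel, 2 km empty travel.
energy = total_energy_kwh(10, 4, 2)
```

Under these assumed weights, an AGV comparing two conflict-free paths would prefer the one with lower energy, since `path_utility(1.0, 0)` exceeds `path_utility(3.0, 0)`.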

5.3.3. Adaptability to Dynamic Scenarios

The symmetric framework and real-time coordination mechanism enable the system to adapt to dynamic changes such as fluctuating order volumes, equipment failures, and AGV battery variations. This addresses the 40% order delivery delay rate caused by fixed scheduling rules in traditional methods, enhancing manufacturing flexibility and responsiveness to market demands.

5.3.4. Industrial Implementation and Validation

Beyond theoretical analysis, the HG-GA framework has achieved notable progress in industrial translation through collaborative efforts:
1. Patent Protection and R&D Collaboration
The core scheduling logic underlying the HG-GA has been awarded a National Invention Patent (patent no. 202510578934.1) in China. Building on this, the method was integrated into a pilot phase of a national key R&D project, conducted in collaboration with Shandong Weiqiao Pioneering Group, an industry leader in advanced manufacturing.
2. Pilot Deployment in Textile Manufacturing
Within the project’s experimental workshop (supporting a textile production line), 10 AGVs were deployed, interfacing with a co-developed informatization management platform. This platform integrates real-time monitoring, dynamic scheduling, and energy management, embedding the HG-GA’s multi-agent coordination mechanisms into day-to-day operations.

5.4. Limitations and Future Work

While the HG-GA demonstrates strong performance, several limitations warrant further research.

5.4.1. Current Limitations

1. Computational Complexity: The hybrid framework involves iterative game negotiations and genetic evolution, which may increase computational overhead in ultra-large-scale scenarios (e.g., 500+ workpieces or 50+ AGVs).
2. Energy Model Refinement: The current energy model treats machine and AGV energy consumption as purely time- and distance-dependent; future work could incorporate more granular factors (e.g., machine start–stop energy loss, AGV path gradient effects).
3. Human–Machine Collaboration: The study focuses on fully automated multi-agent scheduling but does not explicitly integrate human operator roles (e.g., manual intervention in exception handling), which is critical in Industry 5.0 human-centric manufacturing.

5.4.2. Future Research Directions

1. Scalability Enhancement: Lightweight algorithms (e.g., deep reinforcement learning for game strategy initialization) may be explored to reduce computational complexity in large-scale scenarios.
2. Multi-Scenario Extension: The symmetric framework may be extended to cross-factory supply chains, where symmetric interaction rules between factories could optimize global logistics and resource allocation.
3. Digital Twin Integration: Digital twin technology may be included to build a virtual–real mapping system, enabling predictive scheduling and pre-emptive conflict resolution based on real-time virtual simulation.
4. Human–Agent Collaboration: Human-in-the-loop mechanisms may be introduced to balance automated scheduling with human expertise, enhancing system robustness in unstructured environments.

Author Contributions

Conceptualization, W.X.; methodology, W.X.; writing—original draft preparation, B.D.; software, B.D.; validation, B.D.; data curation, J.M.; formal analysis, X.Z.; resources, X.Z.; writing—review and editing, W.X., J.M. and J.C.; supervision, W.X., J.M. and J.C.; project administration, J.C.; funding acquisition, J.C. All authors have read and agreed to the published version of the manuscript.

Funding

This work was supported by the National Key R&D Program of China (nos. 2022YFB4700602 and 2022YFB4700601) and the Ministry of Education Industry–University Cooperative Education Project (no. 22086429092517).

Data Availability Statement

The original contributions presented in this study are included in the article. Further inquiries can be directed to the corresponding author.

Conflicts of Interest

The authors have no relevant financial or non-financial interests to disclose.

References

  1. Li, W.; Yao, Y.; Li, X. Integrated Scheduling of Flexible Job-Shop Considering Heterogeneous AGVs. Comput. Integr. Manuf. Syst. 2024, 31, 1539–1554. [Google Scholar] [CrossRef]
  2. Tian, Z.; Jiang, X.; Liu, W.; Zhao, B.; Liu, S.; Tan, Q.; Tian, G. Lot-Streaming Workshop Scheduling with Operation Flexibility: Review and Extension. Systems 2025, 13, 271. [Google Scholar] [CrossRef]
  3. Destouet, C.; Tlahig, H.; Bettayeb, B.; Mazari, B. Flexible job shop scheduling problem under Industry 5.0: A survey on human reintegration, environmental consideration and resilience improvement. J. Manuf. Syst. 2023, 67, 155–173. [Google Scholar] [CrossRef]
  4. Wang, S.; Li, X.; Gao, L.; Li, P. Research Review of Distributed Workshop Scheduling. J. Huazhong Univ. Sci. Technol. (Nat. Sci. Ed.) 2022, 50, 1–10. [Google Scholar] [CrossRef]
  5. Dauzère-Pérés, S.; Ding, J.; Shen, L.; Tamsaaouet, K. The flexible job shop scheduling problem: A review. Eur. J. Oper. Res. 2024, 314, 409–432. [Google Scholar] [CrossRef]
  6. Jiang, B.; Ma, Y.; Chen, L.; Huang, B.; Huang, Y.; Guan, L. A Review on Intelligent Scheduling and Optimization for Flexible Job Shop. Int. J. Control. Autom. Syst. 2023, 21, 3127–3150. [Google Scholar] [CrossRef]
  7. Yu, B. Review of Flexible Flow Shop Scheduling. Mod. Manuf. Eng. 2022, 41, 154–162+71. [Google Scholar] [CrossRef]
  8. He, C.; Song, Y.; Lei, Q.; Lü, X.; Liu, R.; Chen, J. Integrated scheduling of multiple automated guided vehicles and machines in flexible job-shop. China Mech. Eng. 2019, 30, 438–447. [Google Scholar] [CrossRef]
  9. Wang, Y.; Liu, Y.; Wu, Y.; Li, S.; Zong, W. Improved NSGA-II Algorithm for Energy-Saving Flexible Job-Shop Scheduling Considering Transportation Constraints. Comput. Integr. Manuf. Syst. 2023, 29, 3028–3040. [Google Scholar] [CrossRef]
  10. Wang, Y.; Yang, M.; Zhang, G. Research on Path Planning for Automated Guided Vehicles Based on Improved A* Algorithm. Fire Control Command Control 2021, 46, 130–138+144. [Google Scholar] [CrossRef]
  11. Zhao, X.; Ye, H.; Jia, W.; Sun, Z. Research review of agv path planning and obstacle avoidance algorithms. J. Chin. Comput. Syst. 2024, 45, 529–541. [Google Scholar] [CrossRef]
  12. Guo, C.; Chen, X.; Guo, P.; Wang, Q.; Wang, S. Conflict-Free Path Planning for Multiple AGVs Based on Spatiotemporal Astar Algorithm. Comput. Syst. Appl. 2022, 31, 360–368. [Google Scholar] [CrossRef]
  13. Huang, X.; Chen, S.; Zhou, T.; Sun, Y. Review of Genetic Algorithms for Flexible Job-Shop Scheduling. Comput. Integr. Manuf. Syst. 2022, 28, 536–551. [Google Scholar] [CrossRef]
  14. Yu, N.; Li, T.; Wang, B.; Yuan, S. Multi-AGV Scheduling and Path Planning in Automated Sorting Warehouses. Comput. Integr. Manuf. Syst. 2020, 26, 171–180. [Google Scholar] [CrossRef]
  15. Chen, K.; Bi, L.; Wang, W. Research on Integrated Scheduling of AGV and Machines in Flexible Job-Shop. J. Syst. Simul. 2022, 34, 461–469. [Google Scholar] [CrossRef]
  16. Homayouni, S.; Fontes, D. Production and transport scheduling in flexible job shop manufacturing systems. J. Glob. Optim. 2021, 79, 463–502. [Google Scholar] [CrossRef]
  17. Song, C. Improved NSGA-II for Multi-Objective Hybrid Flow Shop Scheduling. Comput. Integr. Manuf. Syst. 2022, 28, 1777–1789. [Google Scholar] [CrossRef]
  18. Zhang, W.; Hu, M.; Li, J.; Zhang, J. Machine-AGV Collaborative Scheduling in Flexible Job-Shop Based on Multi-Agent Non-Cooperative-Evolutionary Game. Comput. Integr. Manuf. Syst. 2024, 1–19. [Google Scholar] [CrossRef]
  19. Zhang, F.; Li, J. An Improved Particle Swarm Optimization Algorithm for Integrated Scheduling Model in AGV-Served Manufacturing Systems. J. Adv. Manuf. Syst. 2018, 17, 375–390. [Google Scholar] [CrossRef]
  20. Xiao, Z.; Cheng, S.; Zheng, D.; Yan, J.; Lou, P.; Wang, X. Digital Twin-Driven Path Planning for AGVs in Workshops. Comput. Integr. Manuf. Syst. 2023, 29, 1905–1915. [Google Scholar] [CrossRef]
  21. Luo, X.; Qian, Q.; Fu, Y. Application Review of Genetic Algorithms for Flexible Job-Shop Scheduling. Comput. Eng. Appl. 2019, 55, 15–21+34. [Google Scholar] [CrossRef]
  22. Li, X.; Nan, K.; Zhao, Z.; Wang, X.; Jing, J. Task Allocation of Handling Robots in Textile Workshops Based on Multi-Agent Game. J. Text. Res. 2020, 41, 78–87. [Google Scholar] [CrossRef]
  23. Yu, H.; Bai, H.; Li, C. Path Planning Research and Simulation for Warehouse-Type Multi-AGV Systems. Comput. Eng. Appl. 2020, 56, 233–241. [Google Scholar] [CrossRef]
  24. Chen, X.; Hou, Z.; Guo, L.; Luo, W. Improved Multi-Objective Genetic Algorithm Based on NSGA-II. Comput. Appl. 2006, 26, 2453–2456. [Google Scholar]
  25. Yu, H.; Wang, Y.; Huang, Y. Research on path planning and task scheduling for multiple agvs. J. Shanghai Univ. Electr. Power 2022, 38, 89–93+97. [Google Scholar] [CrossRef]
  26. Wang, H.; Yin, P.; Zheng, W.; Wang, H.; Zuo, J. Path planning for mobile robots based on improved a* algorithm and dynamic window approach. Robot 2020, 42, 346–353. [Google Scholar] [CrossRef]
  27. Li, S.; Song, Q.; Li, Z.; Zhang, X.; Zhe, L. Research Review of Genetic Algorithms in Robot Path Planning. Sci. Technol. Eng. 2020, 20, 423–431. [Google Scholar] [CrossRef]
Figure 1. Multi-agent shop scheduling process.
Figure 2. Multi-intelligence system framework diagram.
Figure 3. Comparison chart of the fitness of the three algorithms in a simple workshop with 20 workpieces.
Figure 4. Comparison chart of the fitness of the three algorithms in a simple workshop with 100 workpieces.
Figure 5. Comparison chart of the completion times of the three algorithms in a simple workshop with 20 workpieces.
Figure 6. Comparison chart of the completion times of the three algorithms in a simple workshop with 100 workpieces.
Figure 7. Comparison chart of the idle rates of the three algorithms in a simple workshop with 20 workpieces.
Figure 8. Comparison chart of the idle rates of the three algorithms in a simple workshop with 100 workpieces.
Figure 9. Comparison chart of the total energy consumption of the three algorithms in a simple workshop with 20 workpieces.
Figure 10. Comparison chart of the total energy consumption of the three algorithms in a simple workshop with 100 workpieces.
Figure 11. Comparison chart of the fitness of the three algorithms in a complex workshop with 40 workpieces.
Figure 12. Comparison chart of the fitness of the three algorithms in a complex workshop with 100 workpieces.
Figure 13. Comparison chart of the completion times of the three algorithms in a complex workshop with 40 workpieces.
Figure 14. Comparison chart of the completion times of the three algorithms in a complex workshop with 100 workpieces.
Figure 15. Comparison chart of the idle rates of the three algorithms in a complex workshop with 40 workpieces.
Figure 16. Comparison chart of the idle rates of the three algorithms in a complex workshop with 100 workpieces.
Figure 17. Comparison chart of the total energy consumption of the three algorithms in a complex workshop with 40 workpieces.
Figure 18. Comparison chart of the total energy consumption of the three algorithms in a complex workshop with 100 workpieces.
Figure 19. Comparison of fitness value ablation experiments with 100 workpieces in the same scenario.
Figure 20. Comparison of fitness value ablation experiments with 40 workpieces in the same scenario.
Figure 21. Comparison of completion time ablation experiments with 100 workpieces in the same scenario.
Figure 22. Comparison of completion time ablation experiments with 40 workpieces in the same scenario.
Figure 23. Comparison of empty driving rate ablation experiments with 100 workpieces in the same scenario.
Figure 24. Comparison of empty driving rate ablation experiments with 40 workpieces in the same scenario.
Figure 25. Comparison of total energy consumption ablation experiments with 100 workpieces in the same scenario.
Figure 26. Comparison of total energy consumption ablation experiments with 40 workpieces in the same scenario.
Table 1. Strategy space.

| Agent Type | Strategy Space $S_j$ | Key State Variables |
| --- | --- | --- |
| Machine Agent | Whether to accept the current workpiece, $J_i \in \{0, 1\}$ | Queue length $Q_j$, idle time $T_{idle}$ |
| AGV Agent | Transportation path selection, $R_k \in P$ | Remaining power $E_k$, current position $(x_k, y_k)$ |
| Workpiece | Select processing equipment and transportation path, $(MA_j, AGV_k)$ | Current position $(x_i, y_i)$, process constraint $O_i$ |
Table 2. Comprehensive performance comparison of the HG-GA in manufacturing workshops.

| Scenario | Workpieces | Metric | HG-GA | TGA | PSO | Improvement vs. TGA (%) | Improvement vs. PSO (%) |
| --- | --- | --- | --- | --- | --- | --- | --- |
| Simple | 20 | Fitness | 250.0 | 315.0 | 270.0 | 20.6 | 7.4 |
| Simple | 20 | Makespan (min) | 50.0 | 110.0 | 90.0 | 54.5 | 44.4 |
| Simple | 20 | AGV Idle Rate (%) | 9.5 | 30.0 | 30.0 | 68.3 | 68.3 |
| Simple | 20 | Energy (kWh) | 780.0 | 925.0 | 875.0 | 15.7 | 10.9 |
| Simple | 100 | Fitness | 295.0 | 385.0 | 320.0 | 23.4 | 7.8 |
| Simple | 100 | Makespan (min) | 300.0 | 500.0 | 480.0 | 40.0 | 37.5 |
| Simple | 100 | AGV Idle Rate (%) | 13.0 | 30.0 | 35.0 | 56.7 | 62.9 |
| Simple | 100 | Energy (kWh) | 4000.0 | 5000.0 | 4800.0 | 20.0 | 16.7 |
| Complex | 40 | Fitness | 550.0 | 780.0 | 700.0 | 29.5 | 21.4 |
| Complex | 40 | Makespan (min) | 60.0 | 140.0 | 140.0 | 57.1 | 57.1 |
| Complex | 40 | AGV Idle Rate (%) | 13.0 | 40.0 | 58.0 | 67.5 | 77.6 |
| Complex | 40 | Energy (kWh) | 1800.0 | 2400.0 | 2200.0 | 25.0 | 18.2 |
| Complex | 100 | Fitness | 1500.0 | 1900.0 | 1700.0 | 21.1 | 11.8 |
| Complex | 100 | Makespan (min) | 300.0 | 450.0 | 450.0 | 33.3 | 33.3 |
| Complex | 100 | AGV Idle Rate (%) | 13.0 | 40.0 | 58.0 | 67.5 | 77.6 |
| Complex | 100 | Energy (kWh) | 4500.0 | 6500.0 | 5500.0 | 30.8 | 18.2 |

HG-GA: proposed hybrid game–genetic algorithm; TGA: traditional genetic algorithm; PSO: particle swarm optimization. Improvement calculated as $\frac{\text{Baseline} - \text{HG-GA}}{\text{Baseline}} \times 100\%$. All values are averages from 50 independent trials with a standard deviation < 5% of the mean.
Disclaimer/Publisher’s Note: The statements, opinions and data contained in all publications are solely those of the individual author(s) and contributor(s) and not of MDPI and/or the editor(s). MDPI and/or the editor(s) disclaim responsibility for any injury to people or property resulting from any ideas, methods, instructions or products referred to in the content.

Citation: Xie, W.; Du, B.; Ma, J.; Chen, J.; Zheng, X. Research on a Multi-Agent Job Shop Scheduling Method Based on Improved Game Evolution. Symmetry 2025, 17, 1368. https://doi.org/10.3390/sym17081368