Optimization of Dynamic Scheduling for Flexible Job Shops Using Multi-Agent Deep Reinforcement Learning
Abstract
1. Introduction
2. Current Research Status
2.1. FJSP Current Research Status
2.2. Necessity and Core Challenges of Dynamic Scheduling
2.3. Current Research Status of Multi-Agent Deep Reinforcement Learning
3. Problem Overview and Theoretical Analysis
3.1. Mathematical Model for the FJSP
- (1) Job Sequence Constraint: Ensures that each subsequent operation of a job commences only after the preceding operation has been completed.
- (2) Machine Assignment Uniqueness: Each operation must be exclusively assigned to a single candidate machine.
- (3) Resource Non-Conflict Constraint: The processing time intervals of any two operations assigned to the same machine must not overlap.
- (4) Processing Time Association: Operations are processed without interruption; processing is non-preemptive.
- (5) Non-Negative Time Constraint: All start times, processing times, and completion times must be non-negative.
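The five constraints can be written compactly in common FJSP notation. Since the paper's own symbols did not survive extraction here, the symbols below (operations, times, indicator variables) are illustrative:

```latex
% Illustrative notation: O_{ij} = j-th operation of job i; M_{ij} = candidate machine set;
% s_{ij}, c_{ij} = start/completion times; p_{ijk} = processing time of O_{ij} on machine k;
% x_{ijk} \in \{0,1\} = 1 iff O_{ij} is assigned to machine k.
\begin{align}
  s_{i,j+1} &\ge c_{i,j} && \text{(1) job sequence} \\
  \sum_{k \in M_{ij}} x_{ijk} &= 1 && \text{(2) machine uniqueness} \\
  s_{i'j'} &\ge c_{ij} \ \text{ or } \ s_{ij} \ge c_{i'j'} && \text{(3) no overlap if both on machine } k \\
  c_{ij} &= s_{ij} + \textstyle\sum_{k \in M_{ij}} x_{ijk}\, p_{ijk} && \text{(4) non-preemptive processing} \\
  s_{ij},\; p_{ijk},\; c_{ij} &\ge 0 && \text{(5) non-negativity}
\end{align}
```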
3.2. FJSP Solving Model Based on 3DQN Algorithm
3.3. Description and Analysis of the Flexible Job Shop Dynamic Scheduling Problem
3.4. Transformation of Dynamic Flexible Job-Shop Scheduling Problems
3.4.1. Dynamic Rescheduling Strategy
3.4.2. Dynamic Scheduling Based on Multi-Agent Deep Reinforcement Learning
4. Experimental Design and Results
4.1. Experimental Setup
4.2. Experimental Design
4.3. Results Analysis
4.3.1. Machine Failure Rescheduling Experiment
4.3.2. Order Insertion Rescheduling Experiment
5. Conclusions
- (1) This study optimizes maximum completion time, minimizes total energy consumption, and reduces overall mechanical load by introducing dynamic events (such as machine failures and order insertions) and implementing event-triggered rescheduling strategies. Two approaches, right-shift rescheduling and full rescheduling, are employed to address these challenges, achieving dynamic decoupling of scheduling decisions and global optimization.
- (2) A multi-agent collaboration mechanism was designed to decouple the strong coupling between process prioritization and machine allocation through division of labor between workpiece selection agents and machine assignment agents. The 3DQN algorithm was introduced to optimize Q-value function estimation, combining the state-action decoupling advantage of DuDQN with DoDQN’s capability to prevent overestimation of Q-values. Hierarchical experience replay and target network synchronization mechanisms were adopted to enhance the stability of the algorithm training process.
- (3) Simulation experiments demonstrate that the 3DQN algorithm outperforms traditional DoDQN and DuDQN algorithms in Pareto frontier solution distribution across standard test cases for scenarios involving machine failures and emergency order insertions, particularly when combined with event-triggered rescheduling strategies (right-shift rescheduling and full rescheduling). The results indicate that the full rescheduling strategy can significantly reduce completion time under sudden disturbances, achieving a balance between production flexibility and efficiency.
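The right-shift strategy from conclusion (1) can be sketched as follows. This is a simplified illustration, not the paper's implementation: only operations on the failed machine are delayed, and the `Op` type and parameter names are hypothetical; a full implementation would also propagate the delay along job precedence to operations on other machines.

```python
from dataclasses import dataclass, replace

@dataclass(frozen=True)
class Op:
    job: int
    machine: int
    start: float
    duration: float

    @property
    def end(self) -> float:
        return self.start + self.duration

def right_shift(schedule, failed_machine, fail_start, repair_time):
    """Right-shift rescheduling (simplified): every operation on the failed
    machine that has not finished before the failure is pushed later by the
    repair duration; machine assignments and operation order are preserved."""
    out = []
    for op in sorted(schedule, key=lambda o: o.start):
        if op.machine == failed_machine and op.end > fail_start:
            out.append(replace(op, start=op.start + repair_time))
        else:
            out.append(op)
    return out
```

Full rescheduling, by contrast, would discard the remaining schedule at the failure instant and re-solve assignment and sequencing from scratch, which is why it recovers more completion time at a higher computational cost.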
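The two ingredients of 3DQN named in conclusion (2) can be shown in a few lines of numpy: the dueling (DuDQN) head that separates state value from action advantages, and the double-DQN (DoDQN) target that uses the online network to select the next action and the target network to evaluate it. This is a minimal sketch of the two standard formulas, not the paper's network or training loop:

```python
import numpy as np

def dueling_q(value, advantages):
    """DuDQN head: Q(s, a) = V(s) + A(s, a) - mean_a A(s, a).
    Subtracting the mean advantage keeps V and A identifiable."""
    advantages = np.asarray(advantages, dtype=float)
    return value + advantages - advantages.mean(axis=-1, keepdims=True)

def double_dqn_target(reward, gamma, q_online_next, q_target_next, done):
    """DoDQN target: the online network selects the greedy next action,
    the target network evaluates it, curbing Q-value overestimation."""
    q_online_next = np.asarray(q_online_next, dtype=float)
    q_target_next = np.asarray(q_target_next, dtype=float)
    a_star = np.argmax(q_online_next, axis=-1)
    q_eval = np.take_along_axis(q_target_next, a_star[..., None], axis=-1).squeeze(-1)
    return np.asarray(reward, dtype=float) + gamma * (1.0 - np.asarray(done, dtype=float)) * q_eval
```

In training, `dueling_q` would aggregate the two network heads before action selection, and `double_dqn_target` would supply the regression target for the sampled transitions, with the target network periodically synchronized to the online one.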
Author Contributions
Funding
Data Availability Statement
Acknowledgments
Conflicts of Interest
References
- Zhou, J. Intelligent manufacturing—Main direction of “Made in China 2025”. China Mech. Eng. 2015, 26, 2273–2284. [Google Scholar]
- Wang, C. Research on Multi-Objective Flexible Job Shop Scheduling Model and Evolutionary Algorithm. Master’s Thesis, Jiangnan University, Wuxi, China, 2018; pp. 13–14. [Google Scholar]
- Bragin, M.A.; Luh, P.B.; Yan, J.H.; Yu, N.; Stern, G.A. Convergence of the surrogate Lagrangian relaxation method. J. Optim. Theory Appl. 2015, 164, 173–201. [Google Scholar] [CrossRef]
- Zhu, X.; Xu, J.; Ge, J.; Wang, Y.; Xie, Z. Multi-task multi-agent reinforcement learning for real-time scheduling of a dual-resource flexible job shop with robots. Processes 2023, 11, 267. [Google Scholar] [CrossRef]
- Zhang, S.; Qiu, B.; Shan, J.; Long, Q. Vehicle routing optimization and algorithm research considering driver fatigue. Ind. Eng. 2023, 26, 132–140, 184. [Google Scholar]
- Brucker, P.; Schlie, R. Job-shop scheduling with multi-purpose machines. Computing 1990, 45, 369–375. [Google Scholar] [CrossRef]
- Hou, Y.; Liao, X.; Chen, G.; Chen, Y. Co-evolutionary NSGA-III with deep reinforcement learning for multi-objective distributed flexible job shop scheduling. Comput. Ind. Eng. 2025, 203, 110990. [Google Scholar] [CrossRef]
- Tang, H.T.; Liu, X.; Zhang, W.; Lei, D.M.; Wang, K.P. An enhanced Q-learning-based artificial bee colony algorithm for green scheduling in distributed flexible assembly workshops. Ind. Eng. Manag. 2024, 29, 166–179. [Google Scholar]
- Wei, G.; Ye, C. A hybrid estimation of distribution algorithm for solving the distributed flexible job shop scheduling problem. Oper. Res. Manag. Sci. 2024, 33, 51–57. [Google Scholar]
- Pezzella, F.; Morganti, G.; Ciaschetti, G. A genetic algorithm for the flexible job-shop scheduling problem. Comput. Oper. Res. 2008, 35, 3202–3212. [Google Scholar] [CrossRef]
- Lei, J. Research on Flexible Job Shop Scheduling Method Based on Multi-Agent. Master’s Thesis, Hefei University of Technology, Hefei, China, 2023; pp. 15–16. [Google Scholar]
- Ge, T.Z. Flexible Job Shop Energy Efficiency Scheduling Optimization Based on DQN Cooperative Coevolution Algorithm. Master’s Thesis, Wuhan University of Science and Technology, Wuhan, China, 2023; pp. 19–22. [Google Scholar]
- Zhang, G.; Gao, L.; Shi, Y. An effective genetic algorithm for the flexible job-shop scheduling problem. Expert Syst. Appl. 2011, 38, 3563–3573. [Google Scholar] [CrossRef]
- Xing, L.N.; Chen, Y.W.; Wang, P.; Zhao, Q.S.; Xiong, J. A knowledge-based ant colony optimization for flexible job shop scheduling problems. Appl. Soft Comput. 2010, 10, 888–896. [Google Scholar] [CrossRef]
- Zhang, W.W.; Hu, M.Z.; Li, J.W.; Zhang, J. Machine-AGV collaborative scheduling in flexible job shop based on multi-agent non-cooperative evolutionary game. Comput. Integr. Manuf. Syst. 2025, 41, 13–15. [Google Scholar]
- Jiang, X.Y.; Chen, J.Q.; Wang, L.Q.; Xu, W.H. Scheduling optimization of mixed-model assembly lines for machine tools based on deep multi-agent reinforcement learning. Ind. Eng. J. 2025, 10, 15–16. [Google Scholar]
- Liu, Y.F.; Li, C.; Wang, Z.; Wang, J.L. Research progress on multi-agent deep reinforcement learning and its scalability. Comput. Eng. Appl. 2025, 61, 1–24. [Google Scholar] [CrossRef]
- Li, Y.C.; Liu, Z.J.; Hong, Y.T.; Wang, J.C.; Wang, J.R.; Li, Y.; Tang, Y. A survey on games based on multi-agent reinforcement learning. Acta Autom. Sin. 2025, 51, 540–558. [Google Scholar]
- Ma, X.Y. Optimization of cruising taxi scheduling strategy based on multi-agent deep reinforcement learning mechanism. Acta Geod. Cartogr. Sin. 2024, 53, 778. [Google Scholar]
- Zhang, C.; Xu, Y.W.; Li, W.J.; Wang, W.; Zhang, G.X. Research on LEO constellation beam hopping resource scheduling based on multi-agent deep reinforcement learning. J. Commun. 2025, 46, 35–51. [Google Scholar]
- Tremblet, D.; Thevenin, S.; Dolgui, A. Makespan estimation in a flexible job-shop scheduling environment using machine learning. Int. J. Prod. Res. 2024, 62, 3654–3670. [Google Scholar] [CrossRef]
- Harb, H.; Hijazi, M.; Brahmia, M.E.A.; Idrees, A.K.; AlAkkoumi, M.; Jaber, A.; Abouaissa, A. An intelligent mechanism for energy consumption scheduling in smart buildings. Clust. Comput. 2024, 27, 11149–11165. [Google Scholar] [CrossRef]
- Lv, Z.G.; Zhang, L.H.; Wang, X.Y.; Wang, J.B. Single machine scheduling proportionally deteriorating jobs with ready times subject to the total weighted completion time minimization. Mathematics 2024, 12, 610. [Google Scholar] [CrossRef]
- Li, J.; Chen, Y.; Zhao, X.; Huang, J. An improved DQN path planning algorithm. J. Supercomput. 2022, 78, 616–639. [Google Scholar] [CrossRef]
- Yu, Y.; Liu, Y.; Wang, J.; Noguchi, N.; He, Y. Obstacle avoidance method based on double DQN for agricultural robots. Comput. Electron. Agric. 2023, 204, 107546. [Google Scholar] [CrossRef]

| Notation | Definition |
|---|---|
|  | Machine failure rate (average number of failures per unit time) |
|  | Emergency order insertion period |
|  | Repair duration |
|  | Emergency order start time |
|  | Indicator variable taking the value 1 if an operation is assigned to a machine at a given time, and 0 otherwise |
|  | Start time and completion time of an operation, respectively |
| Agent | Role | Inputs | Outputs |
|---|---|---|---|
| Process selection agent | Dynamically adjusts the processing priority of operations | Equipment load, remaining processing time of operations, emergency order status | Operation scheduling policy (priority ordering) |
| Machine selection agent | Optimizes machine allocation for processing tasks | Machine state vector (availability, failure probability, energy consumption) | Q-value distribution over machine selection actions |
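The division of labor between the two agents summarized above can be sketched as a decision loop: the process agent turns its observations into a priority ordering, and the machine agent turns its Q-value distribution into a machine choice. The scoring weights and function names below are illustrative assumptions, not the paper's learned policies:

```python
import numpy as np

def process_agent_priority(load, remaining, urgent):
    """Process-selection agent (illustrative): scores each pending operation
    from equipment load, remaining processing time, and emergency-order
    status, then returns operation indices sorted highest-priority first.
    The linear weighting here is a hypothetical stand-in for the learned policy."""
    score = (-np.asarray(load, dtype=float)
             + np.asarray(remaining, dtype=float)
             + 2.0 * np.asarray(urgent, dtype=float))
    return np.argsort(-score)

def machine_agent_pick(q_values, available):
    """Machine-selection agent (illustrative): masks unavailable machines in
    the Q-value distribution, then greedily picks the best machine."""
    masked = np.where(available, np.asarray(q_values, dtype=float), -np.inf)
    return int(np.argmax(masked))
```

Chaining the two calls (priority ordering first, machine choice second) is what decouples process prioritization from machine allocation in the multi-agent design.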
© 2025 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/).
Wang, J.; Li, R.; Wang, Q. Optimization of Dynamic Scheduling for Flexible Job Shops Using Multi-Agent Deep Reinforcement Learning. Processes 2025, 13, 4045. https://doi.org/10.3390/pr13124045
