Search Results (11)

Search Parameters:
Keywords = double flexible job-shop scheduling

26 pages, 2510 KB  
Article
GA-HPO PPO: A Hybrid Algorithm for Dynamic Flexible Job Shop Scheduling
by Yiming Zhou, Jun Jiang, Qining Shi, Maojie Fu, Yi Zhang, Yihao Chen and Longfei Zhou
Sensors 2025, 25(21), 6736; https://doi.org/10.3390/s25216736 - 4 Nov 2025
Viewed by 357
Abstract
The Job Shop Scheduling Problem (JSP), a classical NP-hard challenge, has given rise to various complex extensions to accommodate modern manufacturing requirements. Among them, the Dynamic Flexible Job Shop Scheduling Problem (DFJSP) remains particularly challenging, due to its stochastic task arrivals, heterogeneous deadlines, and varied task types. Traditional optimization- and rule-based approaches often fail to capture these dynamics effectively. To address this gap, this study proposes a hybrid algorithm, GA-HPO PPO, tailored for the DFJSP. The method integrates genetic-algorithm–based hyperparameter optimization with proximal policy optimization to enhance learning efficiency and scheduling performance. The algorithm was trained on four datasets and evaluated on ten benchmark datasets widely adopted in DFJSP research. Comparative experiments against Double Deep Q-Network (DDQN), standard PPO, and rule-based heuristics demonstrated that GA-HPO PPO consistently achieved superior performance. Specifically, it reduced the number of overdue tasks by an average of 18.5 in 100-task scenarios and 197 in 1000-task scenarios, while maintaining a machine utilization above 67% and 28% in these respective scenarios, and limiting the makespan to within 108–114 and 506–510 time units. The model also demonstrated a 25% faster convergence rate and 30% lower variance in performance across unseen scheduling instances compared to standard PPO, confirming its robustness and generalization capability across diverse scheduling conditions. These results indicate that GA-HPO PPO provides an effective and scalable solution for the DFJSP, contributing to improved dynamic scheduling optimization in practical manufacturing environments. Full article
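
The listing gives results only; as a hedged illustration of the general GA-over-PPO idea (evolving PPO hyperparameters and scoring each candidate by a short training-and-evaluation run), a minimal sketch might look as follows. The hyperparameter ranges and the `train_ppo_and_evaluate` scoring function are assumptions for illustration, not the authors' implementation.

```python
import random

# Hypothetical search space for PPO hyperparameters; ranges are illustrative only.
SPACE = {
    "learning_rate": (1e-5, 1e-3),
    "clip_range":    (0.1, 0.3),
    "entropy_coef":  (0.0, 0.02),
    "gae_lambda":    (0.90, 0.99),
}

def random_individual():
    """Sample one hyperparameter configuration uniformly from the space."""
    return {k: random.uniform(lo, hi) for k, (lo, hi) in SPACE.items()}

def crossover(a, b):
    """Uniform crossover: each gene is inherited from either parent."""
    return {k: random.choice([a[k], b[k]]) for k in SPACE}

def mutate(ind, rate=0.2):
    """With probability `rate`, resample a gene anywhere in its range."""
    return {k: (random.uniform(*SPACE[k]) if random.random() < rate else v)
            for k, v in ind.items()}

def ga_hyperparameter_search(fitness_fn, pop_size=8, generations=5, elite=2):
    """Evolve PPO hyperparameters; `fitness_fn` maps a configuration to a score to
    maximize (for example, the negative number of overdue tasks after a short PPO
    training run on the scheduling environment)."""
    population = [random_individual() for _ in range(pop_size)]
    for _ in range(generations):
        scored = sorted(population, key=fitness_fn, reverse=True)
        parents = scored[: max(elite, 2)]
        children = [mutate(crossover(*random.sample(parents, 2)))
                    for _ in range(pop_size - elite)]
        population = scored[:elite] + children
    return max(population, key=fitness_fn)

# Usage (train_ppo_and_evaluate is a placeholder for a short PPO training run):
# best = ga_hyperparameter_search(lambda hp: train_ppo_and_evaluate(**hp))
```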

31 pages, 6262 KB  
Article
Profit-Oriented Multi-Objective Dynamic Flexible Job Shop Scheduling with Multi-Agent Framework Under Uncertain Production Orders
by Qingyao Ma, Yao Lu and Huawei Chen
Machines 2025, 13(10), 932; https://doi.org/10.3390/machines13100932 - 9 Oct 2025
Viewed by 481
Abstract
In the highly competitive manufacturing environment, customers are increasingly demanding punctual, flexible, and customized deliveries, compelling enterprises to balance profit, energy efficiency, and production performance while seeking new scheduling methods to enhance dynamic responsiveness. Although deep reinforcement learning (DRL) has made progress in dynamic flexible job shop scheduling, existing research has rarely addressed profit-oriented optimization. To tackle this challenge, this paper proposes a novel multi-objective dynamic flexible job shop scheduling (MODFJSP) model that aims to maximize net profit and minimize makespan on the basis of traditional FJSP. The model incorporates uncertainties such as new job insertions, fluctuating due dates, and high-profit urgent jobs, and establishes a multi-agent collaborative framework consisting of “job selection–machine assignment.” For the two types of agents, this paper proposes adaptive state representations, reward functions, and variable action spaces to achieve the dual optimization objectives. The experimental results show that the double deep Q-network (DDQN), within the multi-agent cooperative framework, outperforms PPO, DQN, and classical scheduling rules in terms of solution quality and robustness. It achieves superior performance on multiple metrics such as IGD, HV, and SC, and generates bi-objective Pareto frontiers that are closer to the ideal point. The results demonstrate the effectiveness and practical value of the proposed collaborative framework for solving MODFJSP. Full article
(This article belongs to the Section Industrial Systems)

25 pages, 1392 KB  
Article
Dynamic Scheduling for Multi-Objective Flexible Job Shops with Machine Breakdown by Deep Reinforcement Learning
by Rui Wu, Jianxin Zheng and Xiyan Yin
Processes 2025, 13(4), 1246; https://doi.org/10.3390/pr13041246 - 20 Apr 2025
Viewed by 2486
Abstract
Dynamic scheduling for flexible job shops under machine breakdown is a complex and challenging problem owing to its valuable applications in real-life production. However, prior studies have struggled to perform well in changeable scenarios. To address this challenge, this paper introduces a dual-objective deep reinforcement learning (DRL) algorithm to solve this problem. The algorithm is based on the Double Deep Q-network (DDQN) and incorporates an attention mechanism. It decouples action relationships in the action space to reduce problem dimensionality and introduces an adaptive weighting method in agent decision-making to obtain high-quality Pareto front solutions. The algorithm is evaluated on a set of benchmark instances and compared with state-of-the-art algorithms. The experimental results show that the proposed algorithm outperforms the state-of-the-art algorithms regarding machine offset and total tardiness, demonstrating greater stability and higher-quality solutions. In addition, the practical use of the algorithm is verified on cases from real enterprises, where its results remain better than those of a multi-objective meta-heuristic algorithm. Full article
(This article belongs to the Special Issue Transfer Learning Methods in Equipment Reliability Management)
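
The abstract does not specify the adaptive weighting rule; one plausible, hedged reading is to scalarize the two objectives (machine offset and total tardiness) with weights adapted to their observed ranges, as in the sketch below. The normalization scheme and field names are assumptions, not the paper's method.

```python
import numpy as np

class AdaptiveWeighting:
    """Illustrative scalarization of two objectives (e.g., machine offset and total
    tardiness) into a single reward. Weighting each objective inversely to its
    observed range is only one possible interpretation of 'adaptive weighting'."""

    def __init__(self, n_objectives=2, eps=1e-8):
        self.lo = np.full(n_objectives, np.inf)    # running minima of objective deltas
        self.hi = np.full(n_objectives, -np.inf)   # running maxima of objective deltas
        self.eps = eps

    def reward(self, objective_deltas):
        """Map per-step objective increases (lower is better) to a scalar reward."""
        d = np.asarray(objective_deltas, dtype=float)
        self.lo = np.minimum(self.lo, d)
        self.hi = np.maximum(self.hi, d)
        span = np.maximum(self.hi - self.lo, self.eps)
        weights = (1.0 / span) / np.sum(1.0 / span)   # normalized inverse-range weights
        return -float(np.dot(weights, d))             # smaller deltas yield larger reward

# Usage: r = AdaptiveWeighting().reward([machine_offset_delta, tardiness_delta])
```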

28 pages, 10210 KB  
Article
A Dynamic Scheduling Method Combining Iterative Optimization and Deep Reinforcement Learning to Solve Sudden Disturbance Events in a Flexible Manufacturing Process
by Jun Yan, Tianzuo Zhao, Tao Zhang, Hongyan Chu, Congbin Yang and Yueze Zhang
Mathematics 2025, 13(1), 4; https://doi.org/10.3390/math13010004 - 24 Dec 2024
Cited by 1 | Viewed by 2864
Abstract
Unpredictable sudden disturbances such as machine failures, processing time lags, and order changes increase the deviation between actual production and the planned schedule, seriously affecting production efficiency. This phenomenon is particularly severe in flexible manufacturing. In this paper, a dynamic scheduling method combining iterative optimization and deep reinforcement learning (DRL) is proposed to address the impact of uncertain disturbances. A real-time DRL production environment model is established for the flexible job shop scheduling problem. Based on the DRL model, an agent training strategy and an autonomous decision-making method are proposed. An event-driven and period-driven hybrid dynamic rescheduling trigger strategy (HDRS) with four judgment mechanisms is developed. The decision-making method and rescheduling trigger strategy together determine how and when to reschedule in the dynamic scheduling problem. The experimental results show that the trained DRL decision-making model can provide timely feedback on adjusted scheduling arrangements for order problems of different scales. The proposed dynamic-scheduling decision-making method and rescheduling trigger strategy achieve high responsiveness, quick feedback, high quality, and high stability for flexible manufacturing process scheduling decisions under sudden disturbances. Full article
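
As a hedged sketch of how a hybrid event- and period-driven trigger like HDRS could be checked at runtime (the specific judgment mechanisms, thresholds, and state fields below are illustrative assumptions, not the paper's definitions):

```python
from dataclasses import dataclass

@dataclass
class ShopState:
    """Minimal shop-floor snapshot used by the trigger check; fields are illustrative."""
    time: float
    last_reschedule_time: float
    machine_failures: int = 0        # failures observed since the last reschedule
    new_orders: int = 0              # order insertions since the last reschedule
    schedule_deviation: float = 0.0  # accumulated lag behind the planned schedule

def should_reschedule(state: ShopState,
                      period: float = 60.0,
                      max_deviation: float = 30.0) -> bool:
    """Hybrid trigger: reschedule when a disturbance event occurred, the accumulated
    deviation grows too large, or the fixed rescheduling period has elapsed."""
    event_triggered = state.machine_failures > 0 or state.new_orders > 0
    deviation_triggered = state.schedule_deviation > max_deviation
    period_triggered = (state.time - state.last_reschedule_time) >= period
    return event_triggered or deviation_triggered or period_triggered

# Usage:
# if should_reschedule(current_state):
#     new_plan = drl_agent.decide(current_state)   # placeholder for the DRL decision call
```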

25 pages, 1844 KB  
Article
Multi-Agent Reinforcement Learning for Extended Flexible Job Shop Scheduling
by Shaoming Peng, Gang Xiong, Jing Yang, Zhen Shen, Tariku Sinshaw Tamir, Zhikun Tao, Yunjun Han and Fei-Yue Wang
Machines 2024, 12(1), 8; https://doi.org/10.3390/machines12010008 - 22 Dec 2023
Cited by 10 | Viewed by 4702
Abstract
An extended flexible job shop scheduling problem is presented with characteristics of technology and path flexibility (dual flexibility), varied transportation time, and an uncertain environment. The scheduling can greatly increase efficiency and security in complex scenarios, e.g., distributed vehicle manufacturing and multiple-aircraft maintenance. However, optimizing the scheduling places higher demands on accuracy, real-time performance, and generalization, while being subject to the curse of dimensionality and usually incomplete information. Various coupling relations among operations, stations, and resources aggravate the problem. To deal with the above challenges, we propose a multi-agent reinforcement learning algorithm where the scheduling environment is modeled as a decentralized partially observable Markov decision process. Each job is regarded as an agent that decides the next triplet, i.e., operation, station, and employed resource. The novelty of this paper lies in addressing the flexible job shop scheduling problem with dual flexibility and varied transportation time and in proposing a double Q-value mixing (DQMIX) optimization algorithm under a multi-agent reinforcement learning framework. The case-study experiments show that the DQMIX algorithm outperforms existing multi-agent reinforcement learning algorithms in terms of solution accuracy, stability, and generalization. In addition, it achieves better solution quality for larger-scale cases than traditional intelligent optimization algorithms. Full article
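
DQMIX itself is not detailed in this listing; the sketch below only illustrates the QMIX-style monotonic mixing idea that "double Q-value mixing" builds on, with the hypernetworks and the double-Q learning target omitted. All shapes and values are assumptions.

```python
import numpy as np

def monotonic_mixing(agent_qs, state_weights, state_bias):
    """QMIX-style monotonic mixing of per-agent Q-values into a joint value:
    Q_tot = w(s)^T q + b(s), with w(s) forced non-negative so Q_tot is monotone in
    each agent's Q-value. How DQMIX produces w and b from the global state and how
    it forms the double-Q learning target are not reproduced here."""
    w = np.abs(np.asarray(state_weights, dtype=float))   # enforce monotonicity
    return float(np.dot(w, np.asarray(agent_qs, dtype=float)) + state_bias)

# Example with placeholder numbers for three job agents:
# q_tot = monotonic_mixing([1.2, 0.4, -0.3], [0.5, -0.8, 0.1], 0.2)
```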

23 pages, 3575 KB  
Article
Dynamic Job-Shop Scheduling Based on Transformer and Deep Reinforcement Learning
by Liyuan Song, Yuanyuan Li and Jiacheng Xu
Processes 2023, 11(12), 3434; https://doi.org/10.3390/pr11123434 - 15 Dec 2023
Cited by 18 | Viewed by 6654
Abstract
The dynamic job-shop scheduling problem is a complex and uncertain task that involves optimizing production planning and resource allocation in a dynamic production environment. Traditional methods are limited in effectively handling dynamic events and quickly generating scheduling solutions; in order to solve this problem, this paper proposes a solution by transforming the dynamic job-shop scheduling problem into a Markov decision process and leveraging deep reinforcement learning techniques. The proposed framework introduces several innovative components, which make full use of human domain knowledge and machine computing power, to realize the goal of man–machine collaborative decision-making. Firstly, we utilize disjunctive graphs as the state representation, capturing the complex relationships between various elements of the scheduling problem. Secondly, we select a set of dispatching rules through data envelopment analysis to form the action space, allowing for flexible and efficient scheduling decisions. Thirdly, the transformer model is employed as the feature extraction module, enabling effective capturing of state relationships and improving the representation power. Moreover, the framework incorporates the dueling double deep Q-network with prioritized experience replay, mapping each state to the most appropriate dispatching rule. Additionally, a dynamic target strategy with an elite mechanism is proposed. Through extensive experiments conducted on multiple examples, our proposed framework consistently outperformed traditional dispatching rules, genetic algorithms, and other reinforcement learning methods, achieving improvements of 15.98%, 17.98%, and 13.84%, respectively. These results validate the effectiveness and superiority of our approach in addressing dynamic job-shop scheduling problems. Full article
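
For readers unfamiliar with the value-update side of such a framework, the following hedged sketch shows a standard double-DQN target over a dispatching-rule action space together with proportional prioritized-replay priorities; the transformer feature extractor, dueling head, and the paper's exact settings are omitted.

```python
import numpy as np

def double_dqn_targets(q_online_next, q_target_next, rewards, dones, gamma=0.99):
    """Double-DQN bootstrap target for a batch of transitions whose actions are
    indices of dispatching rules. `q_online_next` and `q_target_next` are
    (batch x n_rules) Q-value matrices for the next states from the online and
    target networks; the feature extractor producing them is omitted."""
    best_rules = np.argmax(q_online_next, axis=1)                    # select with online net
    next_q = q_target_next[np.arange(len(best_rules)), best_rules]  # evaluate with target net
    return rewards + gamma * (1.0 - dones) * next_q                 # dones: 1.0 if terminal

def priorities_from_td_error(td_errors, alpha=0.6, eps=1e-6):
    """Proportional prioritized-replay priorities computed from TD errors."""
    return (np.abs(td_errors) + eps) ** alpha

# Usage sketch (arrays stand in for network outputs and a sampled batch):
# targets = double_dqn_targets(q_online(s_next), q_target(s_next), r, done, gamma=0.99)
# buffer.update_priorities(idx, priorities_from_td_error(targets - q_taken))
```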

17 pages, 2793 KB  
Article
Multi-Objective Flexible Flow Shop Production Scheduling Problem Based on the Double Deep Q-Network Algorithm
by Hua Gong, Wanning Xu, Wenjuan Sun and Ke Xu
Processes 2023, 11(12), 3321; https://doi.org/10.3390/pr11123321 - 29 Nov 2023
Cited by 8 | Viewed by 3040
Abstract
In this paper, motivated by the production process of electronic control modules in the digital electronic detonators industry, we study a multi-objective flexible flow shop scheduling problem. The objective is to find a feasible schedule that minimizes both the makespan and the total tardiness. Considering the constraints imposed by the jobs and the machines throughout the manufacturing process, a mixed integer programming model is formulated. By transforming the scheduling problem into a Markov decision process, the agent state features and the actions are designed based on the processing status of the machines and the jobs, along with heuristic rules. Furthermore, a reward function based on the optimization objectives is designed. Within a deep reinforcement learning framework, the Dueling Double Deep Q-Network (D3QN) algorithm is designed to solve the scheduling problem by incorporating the target network, the dueling network, and the experience replay buffer. The D3QN algorithm is compared with heuristic rules, the genetic algorithm (GA), and the optimal solutions generated by Gurobi, and ablation experiments are also conducted. The experimental results demonstrate the high performance of the D3QN algorithm with the target network and the dueling network proposed in this paper. The scheduling model and the algorithm proposed in this paper can provide theoretical support for constructing reasonable production plans for electronic control modules and for improving production efficiency. Full article
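
The dueling aggregation used by D3QN-style networks is standard and can be summarized in a few lines; the sketch below shows only this aggregation step, with the rest of the network and training loop omitted.

```python
import numpy as np

def dueling_q_values(state_value, advantages):
    """Dueling aggregation used in D3QN-style networks:
    Q(s, a) = V(s) + A(s, a) - mean_a A(s, a).
    `state_value` has shape (batch, 1) and `advantages` has shape (batch, n_actions);
    both would normally be produced by two heads of the same network."""
    return state_value + advantages - advantages.mean(axis=1, keepdims=True)

# Example with placeholder numbers: one state, three candidate actions.
# dueling_q_values(np.array([[2.0]]), np.array([[0.5, -0.5, 0.0]]))  # -> [[2.5, 1.5, 2.0]]
```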

28 pages, 5779 KB  
Article
Multi-Task Multi-Agent Reinforcement Learning for Real-Time Scheduling of a Dual-Resource Flexible Job Shop with Robots
by Xiaofei Zhu, Jiazhong Xu, Jianghua Ge, Yaping Wang and Zhiqiang Xie
Processes 2023, 11(1), 267; https://doi.org/10.3390/pr11010267 - 13 Jan 2023
Cited by 26 | Viewed by 8780
Abstract
In this paper, a real-time scheduling problem of a dual-resource flexible job shop with robots is studied. Multiple independent robots and their supervised machine sets form their own work cells. First, a mixed integer programming model is established, which considers the scheduling problems of jobs and machines in the work cells, and of jobs between work cells, based on the process plan flexibility. Second, in order to make real-time scheduling decisions, a framework of multi-task multi-agent reinforcement learning based on centralized training and decentralized execution is proposed. Each agent interacts with the environment and completes three decision-making tasks: job sequencing, machine selection, and process planning. In the process of centralized training, the value network is used to evaluate and optimize the policy network to achieve multi-agent cooperation, and the attention mechanism is introduced into the policy network to realize information sharing among multiple tasks. In the process of decentralized execution, each agent performs multiple task decisions through local observations according to the trained policy network. Then, observation, action, and reward are designed. Rewards include global and local rewards, which are decomposed into sub-rewards corresponding to tasks. The reinforcement learning training algorithm is designed based on a double-deep Q-network. Finally, the scheduling simulation environment is derived from benchmarks, and the experimental results show the effectiveness of the proposed method. Full article

25 pages, 4106 KB  
Article
Hierarchical Reinforcement Learning for Multi-Objective Real-Time Flexible Scheduling in a Smart Shop Floor
by Jingru Chang, Dong Yu, Zheng Zhou, Wuwei He and Lipeng Zhang
Machines 2022, 10(12), 1195; https://doi.org/10.3390/machines10121195 - 9 Dec 2022
Cited by 31 | Viewed by 5449
Abstract
With the development of intelligent manufacturing, machine tools are considered the “mothership” of the equipment manufacturing industry, and the associated processing workshops are becoming more high-end, flexible, intelligent, and green. As the core of manufacturing management in a smart shop floor, research into the multi-objective dynamic flexible job shop scheduling problem (MODFJSP) focuses on optimizing scheduling decisions in real time according to changes in the production environment. In this paper, hierarchical reinforcement learning (HRL) is proposed to solve the MODFJSP considering random job arrival, with a focus on achieving the two practical goals of minimizing penalties for earliness and tardiness and reducing total machine load. A two-layer hierarchical architecture is proposed, namely the combination of a double deep Q-network (DDQN) and a dueling DDQN (DDDQN), and state features, actions, and external and internal rewards are designed. Meanwhile, a personal computer-based interaction feature is designed to integrate subjective decision information into the real-time optimization of HRL to obtain a satisfactory compromise. In addition, the proposed HRL framework is applied to multi-objective real-time flexible scheduling in a smart gear production workshop, and the experimental results show that the proposed HRL algorithm outperforms other reinforcement learning (RL) algorithms, metaheuristics, and heuristics in terms of solution quality and generalization and has the added benefit of real-time characteristics. Full article
(This article belongs to the Section Advanced Manufacturing)

20 pages, 3082 KB  
Article
Deep Reinforcement Learning for Dynamic Flexible Job Shop Scheduling with Random Job Arrival
by Jingru Chang, Dong Yu, Yi Hu, Wuwei He and Haoyu Yu
Processes 2022, 10(4), 760; https://doi.org/10.3390/pr10040760 - 13 Apr 2022
Cited by 97 | Viewed by 11954
Abstract
The production process of a smart factory is complex and dynamic. As the core of manufacturing management, research into the flexible job shop scheduling problem (FJSP) focuses on optimizing scheduling decisions in real time according to changes in the production environment. In this paper, deep reinforcement learning (DRL) is proposed to solve the dynamic FJSP (DFJSP) with random job arrival, with the goal of minimizing penalties for earliness and tardiness. A double deep Q-network (DDQN) architecture is proposed, and state features, actions, and rewards are designed. A soft ε-greedy behavior policy is designed according to the scale of the problem. The experimental results show that the proposed DRL method outperforms other reinforcement learning (RL) algorithms, heuristics, and metaheuristics in terms of solution quality and generalization. In addition, the soft ε-greedy strategy reasonably balances exploration and exploitation, thereby improving the learning efficiency of the scheduling agent. The DRL method adapts to dynamic changes in the production environment of a flexible job shop, which contributes to the establishment of a flexible scheduling system with self-learning, real-time optimization, and intelligent decision-making. Full article
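
The exact form of the soft ε-greedy policy is not given in the abstract; one common reading, sketched below, replaces the uniform exploration draw with a softmax over Q-values. The temperature and ε values are illustrative assumptions.

```python
import numpy as np

def soft_epsilon_greedy(q_values, epsilon=0.1, temperature=1.0, rng=np.random.default_rng()):
    """One plausible reading of a 'soft' epsilon-greedy policy: with probability
    1 - epsilon act greedily; otherwise sample from a softmax over Q-values instead
    of uniformly, so exploration still prefers promising actions."""
    q = np.asarray(q_values, dtype=float)
    if rng.random() > epsilon:
        return int(np.argmax(q))                    # exploit
    z = (q - q.max()) / temperature                 # numerically stable softmax
    probs = np.exp(z) / np.exp(z).sum()
    return int(rng.choice(len(q), p=probs))         # soft exploration

# Usage: action = soft_epsilon_greedy(q_net(state), epsilon=0.2)   # q_net is a placeholder
```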

14 pages, 860 KB  
Article
Hybrid Salp Swarm Algorithm for Solving the Green Scheduling Problem in a Double-Flexible Job Shop
by Changping Liu, Yuanyuan Yao and Hongbo Zhu
Appl. Sci. 2022, 12(1), 205; https://doi.org/10.3390/app12010205 - 25 Dec 2021
Cited by 21 | Viewed by 4066
Abstract
Green scheduling is not only an effective way to achieve green manufacturing but also an effective way for modern manufacturing enterprises to achieve energy conservation and emission reduction. The double-flexible job shop scheduling problem (DFJSP) considers both machine flexibility and worker flexibility, so it is more suitable for practical production. First, a multi-objective mixed-integer programming model for the DFJSP with the objectives of optimizing the makespan, total worker costs, and total influence of the green production indicators is formulated. Considering the characteristics of the problem, three-layer salp individual encoding and decoding methods are designed for the multi-objective hybrid salp swarm algorithm (MHSSA), which is hybridized with the Lévy flight, the random probability crossover operator, and the mutation operator. In addition, the influence of the parameter setting on the MHSSA in solving the DFJSP is investigated by means of the Taguchi method of design of experiments. The simulation results for benchmark instances show that the MHSSA can effectively solve the proposed problem and is significantly better than the MSSA and the MOPSO algorithm in the diversity, convergence, and dominance of the Pareto frontier. Full article
(This article belongs to the Special Issue Artificial Intelligence and Optimization in Industry 4.0)
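
The Lévy flight hybridized into salp swarm variants is typically implemented with Mantegna's algorithm; the sketch below shows that step only, and how the step is scaled and applied to the three-layer salp encoding is left unspecified here.

```python
import numpy as np
from math import gamma, pi, sin

def levy_step(dim, beta=1.5, rng=np.random.default_rng()):
    """Levy flight step via Mantegna's algorithm, a common way to implement the
    Levy flight component in hybrid salp swarm algorithms. Scaling the step and
    applying it to a salp's encoding are left to the caller."""
    sigma_u = (gamma(1 + beta) * sin(pi * beta / 2) /
               (gamma((1 + beta) / 2) * beta * 2 ** ((beta - 1) / 2))) ** (1 / beta)
    u = rng.normal(0.0, sigma_u, size=dim)
    v = rng.normal(0.0, 1.0, size=dim)
    return u / np.abs(v) ** (1 / beta)

# Usage: candidate = position + 0.01 * levy_step(position.size) * (position - leader)
```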
