A Policy-Based Rough Optimization with Large Neighborhood Search for Carbon-Aware Flexible Job Shop Scheduling with Tardiness Penalty
Abstract
1. Introduction
2. Literature Review
| Approach | Representative Studies | Objectives Addressed | Advantages | Scope and Limitations for CAFJSP-T | Relevance to Pro-LNS |
|---|---|---|---|---|---|
| Exact optimization, MILP, and CP | [27,29,43] | Makespan, energy, carbon, feasibility, and transportation | Provides rigorous formulations and useful bounds | Scalability becomes challenging as flexible routing, sequencing, carbon, and tardiness interact | Supports the CAFJSP-T model and warm-start evaluation logic |
| GA, NSGA-II, and evolutionary algorithms | [28,46,47] | Makespan, energy, carbon, cost, customer satisfaction, and tardiness | Provides broad exploration of multi-objective trade-offs | Often benefits from adaptive operators or local search in complex FJSP variants | Motivates hybrid search and post-construction refinement |
| Swarm and population-based heuristics | [49,50,53] | Energy, carbon, makespan, workload, and transport | Flexible and adaptable across green scheduling variants | Performance can depend on parameter settings and problem-specific operators | Supports the value of adaptive search in green scheduling |
| DQN and value-based DRL | [38,58,59] | Dynamic scheduling, dispatching, energy, carbon, and delay | Learns state-dependent scheduling behavior | Requires careful state and reward design, and may focus on specific objective sets | Supports learning-based scheduling but motivates PPO-based construction plus refinement |
| PPO and policy-gradient RL | [60,65,66] | Makespan, tardiness, dynamic scheduling, and multi-objective scheduling | Supports stable policy learning for sequential decisions and large action spaces | Policy output can still benefit from instance-specific post-processing | Justifies the rough optimization phase of Pro-LNS |
| Hybrid RL plus search | [71,72,73] | Scheduling, routing, energy-aware job shop, and combinatorial optimization | Combines learned guidance with local or neighborhood refinement | Often developed for makespan, routing, energy, or operator control rather than CAFJSP-T | Supports the Pro-LNS design logic |
| Tabu search, local search, and VNS | [78,79,81] | Sequencing, assignment, makespan, energy, and transportation | Strong local improvement around incumbent schedules | Small neighborhoods may not capture larger assignment-sequencing interactions | Motivates larger LNS-style destroy-and-repair refinement |
| LNS and ALNS | [82,84,85] | Energy, makespan, workload, dynamic scheduling, and transport | Preserves useful incumbent structure while reconstructing high-impact parts | Requires problem-specific destroy, repair, and acceptance rules | Directly supports the LNS phase of Pro-LNS |
| Candidate Refinement Method | Strengths | Scope and Limitations for CAFJSP-T Refinement | Why LNS Is Selected or Complementary |
|---|---|---|---|
| Exact reoptimization | Provides rigorous optimization and bounds [25,43] | Full reoptimization may become computationally expensive as routing, sequencing, carbon, and tardiness interact [44,45] | LNS performs targeted partial reconstruction while preserving the incumbent |
| GA or NSGA-II | Provides population-level exploration [46,47] | Detailed refinement may require many evaluations or embedded local search [48,86] | LNS intensifies around one promising PPO-generated schedule |
| PSO, ACO, and swarm methods | Offers flexible search over complex discrete spaces [49,50] | Search quality can depend on parameters, representation, and operators [51,52] | LNS provides a direct destroy-and-repair mechanism tied to objective impact |
| Standalone PPO or DRL | Produces fast adaptive schedules after training [37,60] | A policy-generated schedule may still contain improvable structures [71,72] | LNS refines high-impact operations without discarding the PPO-guided incumbent |
| Tabu search and small-neighborhood local search | Provides strong local improvement [78,79] | Small neighborhoods may not capture larger coordinated changes [80,81] | LNS expands the move scale by reconstructing groups of operations |
| LNS or ALNS | Supports incumbent-preserving large destroy-and-repair moves [82,84] | Requires tailored removal, repair, and acceptance logic [85,87] | Best matches Pro-LNS because the same carbon-tardiness objective can guide removal, reinsertion, and acceptance |
3. Methodology
3.1. Problem Definition
3.1.1. Problem Assumptions
1. Machines are continuously available throughout the scheduling horizon, with no downtime due to breakdowns or maintenance.
2. Operations are processed without interruption once started; that is, processing is non-preemptive.
3. Processing-time-dependent carbon emissions remain constant for each machine during active operation.
4. All jobs are available for processing at time zero, and job due dates are predetermined and fixed.
5. Tardiness penalties are deterministic and time-invariant over the scheduling horizon.
6. The study considers a deterministic, steady-state production environment, without uncertain processing times, machine failures, sequence-dependent setup times, dynamic job arrivals, or worker-related constraints.
7. Carbon-emission estimation is based on steady-state machine processing conditions and does not incorporate transient operating states, time-varying carbon intensity, or machine warm-up and cool-down effects.
3.1.2. Notations
3.1.3. Mathematical Formulation
3.1.4. Objective Scaling and Weighting
3.2. Policy-Based Rough Optimization with Large Neighborhood Search (Pro-LNS)
3.2.1. Phase I: MDP-Based Reinforcement Learning
State Space
- Ready-flag vector: Indicates which operations are currently eligible for dispatch, as shown in Equation (16):
- Machine ready-time vector: Records the earliest time at which each machine becomes available, as shown in Equation (17):
- Normalized earliest-completion-time matrix: Estimates the completion time of each eligible operation on each eligible machine, as shown in Equation (18), where the subscript indexes the k-th eligible machine for the operation.
- Critical-path metrics: Encode downstream workload and due-date slack, as shown in Equation (19). The first quantity is a precomputed lower bound on the remaining processing time of job j from operation o onward, obtained as the sum of the minimum eligible processing times of the remaining operations; the second is the earliest machine-available time at decision step t. Equation (20) then combines these terms.
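The four state components above can be assembled into a single observation vector. The sketch below is illustrative only: the function name, the shared max-based normalization, and the input layout are assumptions, since the paper's exact normalization constants and Equations (16)-(20) are not reproduced here.

```python
import numpy as np

def build_state(ready, machine_ready, ect, remaining_work, due_slack):
    """Concatenate the four state components (ready flags, machine
    ready times, earliest-completion-time matrix, critical-path
    metrics) into one flat observation vector. Simplified sketch:
    all time-valued components are scaled by the largest
    completion-time estimate so they share a comparable range."""
    ect = np.asarray(ect, dtype=float)
    scale = ect.max() if ect.size and ect.max() > 0 else 1.0
    return np.concatenate([
        np.asarray(ready, dtype=float),                   # ready-flag vector
        np.asarray(machine_ready, dtype=float) / scale,   # machine ready times
        ect.ravel() / scale,                              # completion estimates
        np.asarray(remaining_work, dtype=float) / scale,  # downstream workload
        np.asarray(due_slack, dtype=float) / scale,       # due-date slack
    ])
```

A two-job, two-machine toy instance then yields a flat vector whose length is the sum of the component sizes, which is the form an MLP policy expects.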
Action Space
Transition Function
Reward Function
Learning Objective
3.2.2. Phase II: Adaptive Large Neighborhood Search (LNS)
Adaptive Removal
1. Marginal-impact scoring: For each scheduled operation, estimate its contribution to the scalarized objective by evaluating the changes in carbon emissions and tardiness penalty associated with removing and reinserting that operation. The combined score is computed as shown in Equation (29).
2. Removal: Remove the operations with the highest combined scores, producing a partial schedule in which the most disruptive operations are unscheduled.
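The destroy step above can be sketched as follows. This is a simplified stand-in: the objective is passed in as a black-box evaluator, and Equation (29)'s carbon and tardiness terms are approximated by a plain remove-and-reevaluate delta rather than the paper's exact score.

```python
def marginal_impact_removal(schedule, objective, k):
    """Score each scheduled operation by how much the scalarized
    objective improves when it is removed, then unschedule the k
    highest-impact operations (adaptive removal sketch)."""
    base = objective(schedule)
    scores = {}
    for op in schedule:
        reduced = [o for o in schedule if o != op]
        # Positive score: removing op lowers the scalarized objective.
        scores[op] = base - objective(reduced)
    removed = sorted(scores, key=scores.get, reverse=True)[:k]
    partial = [o for o in schedule if o not in removed]
    return partial, removed
```

With a toy objective that charges each operation a fixed cost, the most expensive operation is removed first, leaving the rest as the partial schedule.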
Greedy Reinsertion
1. Precedence constraint: An operation is considered for reinsertion only after its predecessor, if any, has already been reinserted.
2. Feasible start times: For each eligible machine, compute the start and completion times, as shown in Equation (33).
3. Objective-based choice: For each eligible machine, evaluate the increase in the scalarized objective if the operation is inserted on that machine. The operation is then assigned to the machine given by Equation (34).
Acceptance and Adaptation
Termination
3.2.3. Policy-Based Rough Optimization with Large Neighborhood Search
3.2.4. RL Architecture and Training Protocol
RL Architecture
Algorithm 1. Policy-based Rough Optimization Neighborhood Search (Pro-LNS)
Training Protocol
PPO Technical Details and Hyperparameter Selection
1. Policy representation: MLP, [256, 256], ReLU. An MLP is used because the CAFJSP-T state is represented as structured numerical inputs, including the ready-flag vector, machine ready-time vector, normalized earliest-completion-time matrix, and critical-path metrics. The two-layer 256-neuron architecture with ReLU provides sufficient nonlinear capacity to capture interactions among job readiness, machine availability, completion-time estimates, and path criticality while keeping training and inference efficient, consistent with deep actor-critic scheduling architectures that improve makespan, tardiness, utilization, and computational performance [92,93,94,95].
2. Optimization stability: learning rate, mini-batch size 64, 10 epochs, maximum gradient norm 0.5. These values are selected to make PPO updates conservative and stable, since small policy changes can strongly affect downstream sequencing and machine availability decisions. Mini-batch, multi-epoch PPO training with constrained updates is commonly used in scheduling frameworks to improve convergence, robustness, makespan, and tardiness performance [64,69,93,96,97].
3. Exploration and update control: entropy coefficient 0.001, clipping range 0.2, advantage normalization. The entropy coefficient encourages limited exploration, while the 0.2 clipping range and advantage normalization prevent unstable policy updates. These mechanisms are important in scheduling because overly random or aggressive updates can degrade dispatching quality; prior PPO scheduling frameworks link clipped updates with stable learning and improved robustness [64,65,68,93].
4. Long-horizon credit assignment: discount factor γ and GAE parameter λ. These values are used because CAFJSP-T decisions have delayed effects on machine idle times, availability, completion times, and final objective values. High discounting and GAE-style advantage estimation are suitable for long-horizon scheduling and delayed-reward environments [37,56,69,98,99].
5. Actor-critic loss: clipped surrogate objective, MSE value loss, entropy regularization, value coefficient 0.5. This standard PPO loss is used to balance policy improvement, value-function learning, and exploration. Actor-critic PPO formulations are widely used in scheduling literature because they stabilize learning and improve solution quality under complex shop-floor dynamics [64,65,93,96,100].
6. Training scale: 500,000 timesteps and eight parallel environments. This training budget is used to expose the agent to diverse production states, while parallel environments improve rollout collection efficiency. Simulation-based PPO scheduling studies similarly rely on extended training experience to improve robustness, generalization, and computational efficiency [64,94,95,101,102,103].
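The credit-assignment mechanism in item 4 can be made concrete with a short GAE sketch. The γ and λ values used in the paper are not reproduced above, so the defaults here (0.99 and 0.95) are illustrative placeholders only; the normalization step mirrors the advantage normalization of item 3.

```python
def gae_advantages(rewards, values, gamma=0.99, lam=0.95):
    """Generalized Advantage Estimation with advantage normalization.
    `rewards[t]` and `values[t]` are per-step rewards and critic
    estimates for one episode; the episode is assumed to terminate
    after the last step (bootstrap value 0)."""
    advantages = [0.0] * len(rewards)
    gae = 0.0
    for t in reversed(range(len(rewards))):
        next_value = values[t + 1] if t + 1 < len(values) else 0.0
        # TD residual, then exponentially weighted accumulation.
        delta = rewards[t] + gamma * next_value - values[t]
        gae = delta + gamma * lam * gae
        advantages[t] = gae
    # Normalize so PPO updates are scale-invariant across episodes.
    mean = sum(advantages) / len(advantages)
    var = sum((a - mean) ** 2 for a in advantages) / len(advantages)
    std = var ** 0.5 or 1.0
    return [(a - mean) / std for a in advantages]
```

High γ lets a tardiness penalty realized at the end of the episode propagate back to the early dispatching decisions that caused it, which is the delayed-effect behavior item 4 describes.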
3.3. Benchmark Instances and Experimental Setup
- Benchmark-based warm-start evaluation: The proposed Pro-LNS framework is applied to the full set of benchmark instances. For each instance, the final Pro-LNS solution is used to warm-start the MILP formulation of the same CAFJSP-T instance by providing it to the solver as an initial incumbent. The MILP solver is then run on the same instance to obtain a best bound and the corresponding optimality gap. This procedure is used to evaluate the quality of the Pro-LNS solution relative to the exact formulation and to quantify how close the final Pro-LNS schedule is to proven optimality within the allotted MILP solve time.
- Weight-sensitivity analysis: A weight-sensitivity analysis examines how the schedule changes under different objective-function priorities. By varying the scalarization weights assigned to carbon emissions and tardiness penalty, the analysis shows how scheduling decisions respond to the relative priority placed on each objective component, illustrating the effect of the weighted objective structure.
- Representative-instance algorithm comparison: Pro-LNS is compared with a Proximal Policy Optimization (PPO)-only ablation, an Advantage Actor-Critic (A2C) Scheduler [108], a Soft Actor-Critic (SAC) Scheduler [109], and a Genetic Algorithm (GA) [106,107] using one representative instance from each small, medium, and large workcenter (WC) configuration. All methods use the same benchmark data, feasibility rules, category-specific normalization constants, and carbon-tardiness scalarized objective. PPO-only corresponds to Phase 1 of Pro-LNS without LNS refinement. A2C and SAC retain their original Markov decision process (MDP) and policy structures, with their objective evaluation adapted to CAFJSP-T. The GA baseline uses the same schedule encoding and objective calculation, with a population size of 150, 200 generations, adaptive crossover probability of 0.85, problem-specific mutation probability of 0.05, including critical-path mutation, and termination after the maximum generation limit or 50 generations without improvement. The comparison is conducted under multiple scalarization weights, and statistical tests are used across WC configurations and weight settings to assess whether the observed performance differences are significant.
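The weight-sensitivity setup can be illustrated with a minimal scalarization sketch. The function name, the complementary-weight convention (w_tardiness = 1 − w_carbon), and the normalization placeholders `c_norm` and `t_norm` are assumptions; the paper's category-specific normalization constants and exact weight grid are not reproduced here.

```python
def scalarized_objective(carbon, tardiness, w_carbon, c_norm=1.0, t_norm=1.0):
    """Weighted carbon-tardiness scalarization: normalize each
    component, then combine with complementary weights."""
    w_tardiness = 1.0 - w_carbon
    return w_carbon * carbon / c_norm + w_tardiness * tardiness / t_norm

# Sweep the carbon weight for one fixed schedule's raw values
# (100.0 emissions units, 40.0 tardiness units; values illustrative)
# to mimic the weight-sensitivity analysis.
sweep = {w: scalarized_objective(100.0, 40.0, w) for w in (0.25, 0.5, 0.75)}
```

As the carbon weight rises, the same schedule's scalarized cost shifts toward its emissions component, so schedules that trade tardiness for lower emissions become relatively more attractive to all compared methods.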
4. Results
- Pro-LNS delivers strong due-date performance on a substantial portion of the benchmark set. Zero tardiness is achieved on sm01_1, sm01_3, med01_2, and lar01_1, and tardiness remains very small on med02_1, med02_5, and lar02_3. Thus, in 7 of the 15 reported instances, Pro-LNS produces schedules with either zero tardiness or only negligible delay while still controlling carbon emissions under the same equal-weight objective.
- The optimality-gap results indicate that the final Pro-LNS solutions are highly competitive with respect to the exact MILP formulation. Across the benchmark instances reported in Table 6, the median optimality gap is 6.12% and the maximum gap is 13.67%. Moreover, 11 of the 15 instances remain within a 10% optimality gap, and all reported instances remain within 14%. Given that these gaps are computed on the same constrained Carbon-Aware Flexible Job Shop Scheduling Problem with Tardiness Penalty (CAFJSP-T) formulation after warm-starting the MILP solver with the final Pro-LNS solution, these values provide strong evidence that Pro-LNS produces high-quality incumbent solutions.
- The method remains computationally efficient across all benchmark categories. The average CPU time is 4.08 s, and the maximum reported CPU time is 10.51 s. This means that Pro-LNS is able to return competitive schedules with bounded optimality gaps in only a few seconds, which is especially valuable for complex flexible job shop environments where exact methods alone can become computationally burdensome.
- Pro-LNS preserves balanced performance under equal objective weighting. Even in instances where tardiness becomes more pronounced, the method continues to return feasible schedules with controlled carbon emissions, reasonable makespans, and moderate optimality gaps. This indicates that Pro-LNS does not sacrifice one objective uncontrollably in order to improve the other but instead maintains a balanced trade-off structure under the equal-weight formulation.
- From a managerial perspective, the results suggest that Pro-LNS is well suited for practical production planning in settings where sustainability and delivery reliability must be addressed together. The combination of a low runtime, controlled emissions, and relatively tight optimality gaps means that decision-makers can obtain strong schedules quickly while still retaining confidence that the solutions are relatively close to the benchmarks provided by the exact optimization frameworks. This makes the method especially applicable to low-carbon production planning, due-date-driven job shops, make-to-order manufacturing environments, and machine-flexible facilities where alternative machines differ in processing time, energy use, or carbon intensity. It is also useful in shops that require repeated rescheduling under limited planning time, such as when customer priorities change, bottlenecks emerge, or updated production plans must be generated during daily operations.
Computational Environment
5. Conclusions
Author Contributions
Funding
Data Availability Statement
Acknowledgments
Conflicts of Interest
References
- Li, Z.; Rasool, S.; Cavus, M.F.; Shahid, W. Sustaining the future: How green capabilities and digitalization drive sustainability in modern business. Heliyon 2024, 10, e24158.
- El Mokadem, M.; Khalaf, M. Building sustainable performance through green supply chain management. Int. J. Product. Perform. Manag. 2024, 74, 203–223.
- Mahar, A.S.; Zhang, Y.; Sadiq, B.; Gul, R.F. Sustainability Transformation Through Green Supply Chain Management Practices and Green Innovations in Pakistan’s Manufacturing and Service Industries. Sustainability 2025, 17, 2204.
- Wang, M.; Zhang, G. What motivates firms to adopt a green supply chain and how much does it matter? Front. Environ. Sci. 2023, 11, 1227008.
- Poggi, A.; Di Persio, L.; Ehrhardt, M. Electricity Price Forecasting via Statistical and Deep Learning Approaches: The German Case. AppliedMath 2023, 3, 316–342.
- Narkhede, G.; Chinchanikar, S.; Narkhede, R.; Chaudhari, T. Role of Industry 5.0 for driving sustainability in the manufacturing sector: An emerging research agenda. J. Strategy Manag. 2024, ahead-of-print.
- Ghobakhloo, M.; Iranmanesh, M.; Foroughi, B.; Tirkolaee, E.B.; Asadi, S.; Amran, A. Industry 5.0 implications for inclusive sustainable manufacturing: An evidence-knowledge-based strategic roadmap. J. Clean. Prod. 2023, 417, 138023.
- Zheng, R.; Li, Z.; Li, L.; Li, S.; Li, X. Group Technology Empowering Optimization of Mixed-Flow Precast Production in Off-Site Construction. Environ. Sci. Pollut. Res. 2024, 31, 11781–11800.
- Fu, Y.; Gao, K.; Wang, L.; Huang, M.; Liang, Y.; Dong, H. Scheduling Stochastic Distributed Flexible Job Shops Using a Multi-Objective Evolutionary Algorithm with Simulation Evaluation. Int. J. Prod. Res. 2024, 63, 86–103.
- Tang, Y.; Shen, L.; Han, S. Low-Carbon Flexible Job Shop Scheduling Problem Based on Deep Reinforcement Learning. Sustainability 2024, 16, 4544.
- Aghakhani, S.; Rajabi, M.S. A New Hybrid Multi-Objective Scheduling Model for Hierarchical Hub and Flexible Flow Shop Problems. AppliedMath 2022, 2, 721–737.
- Destouet, C.; Tlahig, H.; Bettayeb, B.; Mazari, B. Flexible job shop scheduling problem under Industry 5.0: A survey on human reintegration, environmental consideration and resilience improvement. J. Manuf. Syst. 2023, 67, 155–173.
- Zhou, K.; Tan, C.; Wu, Y.; Yang, B.; Long, X. Research on Low-Carbon Flexible Job Shop Scheduling Problem Based on Improved Grey Wolf Algorithm. J. Supercomput. 2024, 80, 12123–12153.
- Gong, Q.; Li, J.; Jiang, Z.; Wang, Y. A hierarchical integration scheduling method for flexible job shop with green lot splitting. Eng. Appl. Artif. Intell. 2024, 129, 107595.
- Mencaroni, A.; Leyman, P.; Raa, B.; De Vuyst, S.; Claeys, D. Towards net-zero manufacturing: Carbon-aware scheduling for GHG emissions reduction. J. Clean. Prod. 2025, 529, 146787.
- Georgiadis, G.P.; Dimitriadis, C.N.; Georgiadis, M.C. Decarbonizing the Industry Sector: Current Status and Future Opportunities of Energy-Aware Production Scheduling. Processes 2025, 13, 1941.
- Naidu, J.T. A New Algorithm for the Weighted Tardiness Problem. J. Appl. Bus. Econ. 2025, 27, 24.
- de Athayde Prata, B.; de Abreu, L.R.; Fernandez-Viagas, V. A systematic review of permutation flow shop scheduling with due-date-related objectives. Comput. Oper. Res. 2025, 177, 106989.
- Xiong, F.; Chen, S.; Xiong, N.; Jing, L. Scheduling distributed heterogeneous non-permutation flowshop to minimize the total weighted tardiness. Expert Syst. Appl. 2025, 272, 126713.
- Ulucak, M.I.; Gökçen, H. Dynamic Scheduling in Identical Parallel-Machine Environments: A Multi-Purpose Intelligent Utility Approach. Appl. Sci. 2025, 15, 2483.
- Meng, L.; Cheng, W.; Zhang, B.; Zou, W.; Duan, P. A novel hybrid algorithm of genetic algorithm, variable neighborhood search and constraint programming for distributed flexible job shop scheduling problem. Int. J. Ind. Eng. Comput. 2024, 15, 813–832.
- Nessari, S.; Tavakkoli-Moghaddam, R.; Bakhshi-Khaniki, H.; Bozorgi-Amiri, A. A hybrid simheuristic algorithm for solving bi-objective stochastic flexible job shop scheduling problems. Decis. Anal. J. 2024, 11, 100485.
- Seck-Tuoh-Mora, J.C.; Escamilla-Serna, N.J.; Montiel-Arrieta, L.J.; Barragán-Vite, I.; Medina-Marín, J. A Global Neighborhood with Hill-Climbing Algorithm for Fuzzy Flexible Job Shop Scheduling Problem. Mathematics 2022, 10, 4233.
- Berterottière, L.; Dauzère-Pérès, S.; Yugma, C. Flexible job-shop scheduling with transportation resources. Eur. J. Oper. Res. 2023, 312, 890–909.
- Yang, S.; Meng, L.; Ullah, S.; Zhang, B.; Sang, H.; Duan, P. MILP Modeling and Optimization of Multi-Objective Three-Stage Flexible Job Shop Scheduling Problem With Assembly and AGV Transportation. IEEE Access 2025, 13, 25369–25386.
- Fernandes, J.; Homayouni, S.; Fontes, D. Energy-Efficient Scheduling in Job Shop Manufacturing Systems: A Literature Review. Sustainability 2022, 14, 6264.
- Li, Z.; Chen, Y.-H. Minimizing the makespan and carbon emissions in the green flexible job shop scheduling problem with learning effects. Sci. Rep. 2023, 13, 6369.
- Jia, S.; Yang, Y.; Li, S.; Wang, S.; Li, A.; Cai, W.; Liu, Y.; Hao, J.; Hu, L. The Green Flexible Job-Shop Scheduling Problem Considering Cost, Carbon Emissions, and Customer Satisfaction under Time-of-Use Electricity Pricing. Sustainability 2024, 16, 2443.
- Park, M.-J.; Ham, A. Energy-aware flexible job shop scheduling under time-of-use pricing. Int. J. Prod. Econ. 2022, 248, 108507.
- Xu, G.; Bao, Q.; Zhang, H. Multi-objective green scheduling of integrated flexible job shop and automated guided vehicles. Eng. Appl. Artif. Intell. 2023, 126, 106864.
- Tang, H.; Huang, J.; Ren, C.; Shao, Y.; Lu, J. Integrated scheduling of multi-objective lot-streaming hybrid flowshop with AGV based on deep reinforcement learning. Int. J. Prod. Res. 2024, 63, 1275–1303.
- Deliktaş, D.; Özcan, E.; Ustun, O.; Torkul, O. Evolutionary algorithms for multi-objective flexible job shop cell scheduling. Appl. Soft Comput. 2021, 113, 107890.
- Lei, D.; Li, M.; Wang, L. A Two-Phase Meta-Heuristic for Multiobjective Flexible Job Shop Scheduling Problem With Total Energy Consumption Threshold. IEEE Trans. Cybern. 2019, 49, 1097–1109.
- Luan, F.; Zhao, H.; Liu, S.; He, Y.; Tang, B. Enhanced NSGA-II for multi-objective energy-saving flexible job shop scheduling. Sustain. Comput. Inform. Syst. 2023, 39, 100901.
- Ojsteršek, R.; Tang, M.; Buchmeister, B. Due date optimization in multi-objective scheduling of flexible job shop production. Adv. Prod. Eng. Manag. 2020, 15, 481–492.
- Wu, Z.; Fan, H.; Sun, Y.; Peng, M. Efficient Multi-Objective Optimization on Dynamic Flexible Job Shop Scheduling Using Deep Reinforcement Learning Approach. Processes 2023, 11, 2018.
- Zhao, L.; Fan, J.; Zhang, C.; Shen, W.; Jing, Z. A DRL-Based Reactive Scheduling Policy for Flexible Job Shops With Random Job Arrivals. IEEE Trans. Autom. Sci. Eng. 2024, 21, 2912–2923.
- Chen, Y.; Liao, X.; Chen, G.; Hou, Y. Dynamic Intelligent Scheduling in Low-Carbon Heterogeneous Distributed Flexible Job Shops with Job Insertions and Transfers. Sensors 2024, 24, 2251.
- Wang, Z.; He, M.; Wu, J.; Chen, H.; Cao, Y. An improved MOEA/D for low-carbon many-objective flexible job shop scheduling problem. Comput. Ind. Eng. 2024, 188, 109926.
- Piroozfard, H.; Wong, K.Y.; Wong, W.P. Minimizing total carbon footprint and total late work criterion in flexible job shop scheduling by using an improved multi-objective genetic algorithm. Resour. Conserv. Recycl. 2016, 128, 267–283.
- Wei, Z.; Liao, W.; Zhang, L. Hybrid energy-efficient scheduling measures for flexible job-shop problem with variable machining speeds. Expert Syst. Appl. 2022, 197, 116785.
- Zhang, F.; Li, R.; Gong, W. Deep reinforcement learning-based memetic algorithm for energy-aware flexible job shop scheduling with multi-AGV. Comput. Ind. Eng. 2024, 189, 109917.
- Meng, L.; Zhang, C.; Ren, Y.; Zhang, B.; Lv, C. Mixed-integer linear programming and constraint programming formulations for solving distributed flexible job shop scheduling problem. Comput. Ind. Eng. 2020, 142, 106347.
- Ji, B.; Zhang, S.; Yu, S.; Zhang, B. Mathematical Modeling and a Novel Heuristic Method for Flexible Job-Shop Batch Scheduling Problem with Incompatible Jobs. Sustainability 2023, 15, 1954.
- Fan, J.; Zhang, C.; Shen, W.; Gao, L. A matheuristic for flexible job shop scheduling problem with lot-streaming and machine reconfigurations. Int. J. Prod. Res. 2022, 61, 6565–6588.
- Mei, Z.; Lu, Y.; Lv, L. Research on Multi-Objective Low-Carbon Flexible Job Shop Scheduling Based on Improved NSGA-II. Machines 2024, 12, 590.
- Sang, Y.; Tan, J. Many-Objective Flexible Job Shop Scheduling Problem with Green Consideration. Energies 2022, 15, 1884.
- Li, R.; Gong, W.; Wang, L.; Lu, C.; Jiang, S. Two-stage knowledge-driven evolutionary algorithm for distributed green flexible job shop scheduling with type-2 fuzzy processing time. Swarm Evol. Comput. 2022, 74, 101139.
- Lei, D.; Zheng, Y.; Guo, X. A shuffled frog-leaping algorithm for flexible job shop scheduling with the consideration of energy consumption. Int. J. Prod. Res. 2016, 55, 3126–3140.
- Jiang, T.; Zhu, H.; Deng, G. Improved African buffalo optimization algorithm for the green flexible job shop scheduling problem considering energy consumption. J. Intell. Fuzzy Syst. 2020, 38, 4573–4589.
- Peng, Z.; Zhang, H.; Tang, H.; Feng, Y.; Yin, W. Research on flexible job-shop scheduling problem in green sustainable manufacturing based on learning effect. J. Intell. Manuf. 2021, 33, 1725–1746.
- Ren, W.; Wen, J.; Yan, Y.; Hu, Y.; Guan, Y.; Li, J. Multi-objective optimisation for energy-aware flexible job-shop scheduling problem with assembly operations. Int. J. Prod. Res. 2020, 59, 7216–7231.
- Tan, W.; Yuan, X.; Huang, G.; Liu, Z. Low-carbon joint scheduling in flexible open-shop environment with constrained automatic guided vehicle by multi-objective particle swarm optimization. Appl. Soft Comput. 2021, 111, 107695.
- Yang, X.; Zhang, J.; Zhang, N.; Li, Y. Low Carbon Multi-Objective Shop Scheduling Based On Genetic and Variable Neighborhood Algorithm. J. Phys. Conf. Ser. 2020, 1574, 012155.
- Hayat, I.; Tariq, A.; Shahzad, W.; Masud, M.; Ahmed, S.; Ali, M.; Zafar, A. Hybridization of Particle Swarm Optimization with Variable Neighborhood Search and Simulated Annealing for Improved Handling of the Permutation Flow-Shop Scheduling Problem. Systems 2023, 11, 221.
- Wang, L.; Pan, Z.; Wang, J. A Review of Reinforcement Learning Based Intelligent Optimization for Manufacturing Scheduling. Complex Syst. Model. Simul. 2021, 1, 257–270.
- Khadivi, M.; Charter, T.; Yaghoubi, M.; Jalayer, M.; Ahang, M.; Shojaeinasab, A.; Najjaran, H. Deep reinforcement learning for machine scheduling: Methodology, the state-of-the-art, and future directions. arXiv 2023, arXiv:2310.03195.
- Liu, R.; Piplani, R.; Toro, C. Deep reinforcement learning for dynamic scheduling of a flexible job shop. Int. J. Prod. Res. 2022, 60, 4049–4069.
- Yi, W.; Chen, N.; Chen, Y.; Pei, Z. An improved deep Q-network for dynamic flexible job shop scheduling with limited maintenance resources. Int. J. Prod. Res. 2025, 63, 9112–9133.
- Song, W.; Chen, X.; Li, Q.; Cao, Z. Flexible Job-Shop Scheduling via Graph Neural Network and Deep Reinforcement Learning. IEEE Trans. Ind. Inform. 2023, 19, 1600–1610.
- Huang, J.-P.; Gao, L.; Li, X. An end-to-end deep reinforcement learning method based on graph neural network for distributed job-shop scheduling problem. Expert Syst. Appl. 2023, 238, 121756.
- Wang, S.; Li, J.; Tang, H.; Wang, J. CEA-FJSP: Carbon emission-aware flexible job-shop scheduling based on deep reinforcement learning. Front. Environ. Sci. 2022, 10, 1059451.
- van Hezewijk, L.; Dellaert, N.; Van Woensel, T.; Gademann, N. Using the proximal policy optimisation algorithm for solving the stochastic capacitated lot sizing problem. Int. J. Prod. Res. 2022, 61, 1955–1978.
- Schulman, J.; Wolski, F.; Dhariwal, P.; Radford, A.; Klimov, O. Proximal Policy Optimization Algorithms. arXiv 2017, arXiv:1707.06347.
- Park, J.; Chun, J.; Kim, S.; Kim, Y.; Park, J. Learning to schedule job-shop problems: Representation and policy learning using graph neural network and reinforcement learning. Int. J. Prod. Res. 2021, 59, 3360–3377.
- Lei, K.; Guo, P.; Zhao, W.; Wang, Y.; Qian, L.; Meng, X.; Tang, L. A multi-action deep reinforcement learning framework for flexible Job-shop scheduling problem. Expert Syst. Appl. 2022, 205, 117796.
- Tang, H.; Dong, J. Solving Flexible Job-Shop Scheduling Problem with Heterogeneous Graph Neural Network Based on Relation and Deep Reinforcement Learning. Machines 2024, 12, 584.
- Ding, L.; Guan, Z.; Rauf, M.; Yue, L. Multi-policy deep reinforcement learning for multi-objective multiplicity flexible job shop scheduling. Swarm Evol. Comput. 2024, 87, 101550.
- Luo, S.; Zhang, L.; Fan, Y. Real-Time Scheduling for Dynamic Partial-No-Wait Multiobjective Flexible Job Shop by Deep Reinforcement Learning. IEEE Trans. Autom. Sci. Eng. 2022, 19, 3020–3038.
- Liu, X.; Han, L.; Kang, L.; Liu, J.; Miao, H. Preference learning based deep reinforcement learning for flexible job shop scheduling problem. Complex Intell. Syst. 2025, 11, 144.
- Pan, Z.; Wang, L.; Wang, J.-J.; Lu, J. Deep Reinforcement Learning Based Optimization Algorithm for Permutation Flow-Shop Scheduling. IEEE Trans. Emerg. Top. Comput. Intell. 2023, 7, 983–994. [Google Scholar] [CrossRef]
- Pan, Z.; Wang, L.; Dong, C.; Chen, J. A Knowledge-Guided End-to-End Optimization Framework Based on Reinforcement Learning for Flow Shop Scheduling. IEEE Trans. Ind. Inform. 2024, 20, 1853–1861. [Google Scholar] [CrossRef]
- Johnn, S.; Darvariu, V.; Handl, J.; Kalcsics, J. A Graph Reinforcement Learning Framework for Neural Adaptive Large Neighbourhood Search. Comput. Oper. Res. 2024, 172, 106791. [Google Scholar] [CrossRef]
- Oren, J.; Ross, C.; Lefarov, M.; Richter, F.; Taitler, A.; Feldman, Z.; Daniel, C.; Di Castro, D. SOLO: Search Online, Learn Offline for Combinatorial Optimization Problems. Proc. Int. Symp. Comb. Search 2021, 12, 97–105. [Google Scholar] [CrossRef]
- Shi, J.; Liu, W.; Yang, J. An Enhanced Multi-Objective Evolutionary Algorithm with Reinforcement Learning for Energy-Efficient Scheduling in the Flexible Job Shop. Processes 2024, 12, 1976. [Google Scholar] [CrossRef]
- Yao, Y.; Li, X.; Gao, L. A DQN-based memetic algorithm for energy-efficient job shop scheduling problem with integrated limited AGVs. Swarm Evol. Comput. 2024, 87, 101544. [Google Scholar] [CrossRef]
- Li, R.; Gong, W.; Lu, C. A reinforcement learning based RMOEA/D for bi-objective fuzzy flexible job shop scheduling. Expert Syst. Appl. 2022, 203, 117380. [Google Scholar] [CrossRef]
- Mora, J.; Escamilla-Serna, N.; Marín, J.; Hernández-Romero, N.; Barragán-Vite, I.; Corona-Armenta, J. A global-local neighborhood search algorithm and tabu search for flexible job shop scheduling problem. PeerJ Comput. Sci. 2021, 7, e574. [Google Scholar] [CrossRef]
- Xie, J.; Li, X.; Gao, L.; Gui, L. A hybrid genetic tabu search algorithm for distributed flexible job shop scheduling problems. J. Manuf. Syst. 2023, 71, 82–94. [Google Scholar] [CrossRef]
- Xie, J.; Teng, Y.; Gao, L.; Li, X.; Zhang, C. An efficient and stable intelligent scheduling algorithm based on hybrid neighbourhood structure for flexible job shop scheduling problem benchmarks. Int. J. Prod. Res. 2025, 63, 7921–7935. [Google Scholar] [CrossRef]
- Birgin, E.; Riveaux, J.; Ronconi, D. Energy-aware flexible job shop scheduling problem with nonlinear routes and position-based learning effect. Int. Trans. Oper. Res. 2025, 33, 860–891. [Google Scholar] [CrossRef]
- Røpke, S.; Pisinger, D. An Adaptive Large Neighborhood Search Heuristic for the Pickup and Delivery Problem with Time Windows. Transp. Sci. 2006, 40, 455–472. [Google Scholar] [CrossRef]
- Rifai, A.; Nguyen, H.; Dawal, S. Multi-objective adaptive large neighborhood search for distributed reentrant permutation flow shop scheduling. Appl. Soft Comput. 2016, 40, 42–57. [Google Scholar] [CrossRef]
- Cota, L.; Guimarães, F.; Ribeiro, R.; Meneghini, I.; Oliveira, F.; Souza, M.; Siarry, P. An adaptive multi-objective algorithm based on decomposition and large neighborhood search for a green machine scheduling problem. Swarm Evol. Comput. 2019, 51, 100601. [Google Scholar] [CrossRef]
- Liu, J.; Sun, B.; Li, G.; Chen, Y. Multi-objective adaptive large neighbourhood search algorithm for dynamic flexible job shop schedule problem with transportation resource. Eng. Appl. Artif. Intell. 2024, 132, 107917. [Google Scholar] [CrossRef]
- Cao, S.; Li, R.; Gong, W.; Lu, C. Inverse model and adaptive neighborhood search based cooperative optimizer for energy-efficient distributed flexible job shop scheduling. Swarm Evol. Comput. 2023, 83, 101419. [Google Scholar] [CrossRef]
- Hariri, F.; Santosa, B. A Hybrid Genetic Algorithm and Adaptive Large Neighborhood Search for Flexible Job Shop Scheduling with Fuzzy Processing Time. In 2025 IEEE International Conference on Industrial Engineering and Engineering Management (IEEM); IEEE: Piscataway, NJ, USA, 2025; pp. 86–90. [Google Scholar] [CrossRef]
- Chung, K.; Lee, C.; Tsang, Y. Neural combinatorial optimization with reinforcement learning in industrial engineering: A survey. Artif. Intell. Rev. 2025, 58, 130. [Google Scholar] [CrossRef]
- Raffin, A.; Hill, A.; Gleave, A.; Kanervisto, A.; Ernestus, M.; Dormann, N. Stable-Baselines3: Reliable Reinforcement Learning Implementations. J. Mach. Learn. Res. 2021, 22, 1–8. Available online: http://jmlr.org/papers/v22/20-1364.html (accessed on 11 May 2026).
- Chen, Z.; Zhang, K.; Liu, P.; Xin, G.; Sun, Z.; Tao, Z.; Zhang, Y.; Ji, W.; Lu, Y.; Jia, L.; et al. Worst-Case Soft Actor-Critic-Based Safe Reinforcement Learning Method for Nonlinear Constrained Waterflood Reservoir Production Optimization. SPE J. 2025, 30, 7745–7766. [Google Scholar] [CrossRef]
- Liang, Y.; Sun, Y.; Zheng, R.; Huang, F. Efficient adversarial training without attacking: Worst-case-aware robust reinforcement learning. arXiv 2022, arXiv:2210.05927. [Google Scholar]
- Liu, C.; Chang, C.; Tseng, C. Actor-Critic Deep Reinforcement Learning for Solving Job Shop Scheduling Problems. IEEE Access 2020, 8, 71752–71762. [Google Scholar] [CrossRef]
- Liu, C.; Huang, T. Dynamic Job-Shop Scheduling Problems Using Graph Neural Network and Deep Reinforcement Learning. IEEE Trans. Syst. Man. Cybern. Syst. 2023, 53, 6836–6848. [Google Scholar] [CrossRef]
- Ruiz, J.; Mula, J.; Escoto, R. Job shop smart manufacturing scheduling by deep reinforcement learning. J. Ind. Inf. Integr. 2024, 38, 100582. [Google Scholar] [CrossRef]
- Wang, R.; Jing, Y.; Gu, C.; He, S.; Chen, J. End-to-End Multitarget Flexible Job Shop Scheduling With Deep Reinforcement Learning. IEEE Internet Things J. 2025, 12, 4420–4434. [Google Scholar] [CrossRef]
- Zhang, L.; Feng, Y.; Xiao, Q.; Xu, Y.; Li, D.; Yang, D.; Yang, Z. Deep reinforcement learning for dynamic flexible job shop scheduling problem considering variable processing times. J. Manuf. Syst. 2023, 71, 257–273. [Google Scholar] [CrossRef]
- Zhou, Y.; Jiang, J.; Shi, Q.; Fu, M.; Zhang, Y.; Chen, Y.; Zhou, L. GA-HPO PPO: A Hybrid Algorithm for Dynamic Flexible Job Shop Scheduling. Sensors 2025, 25, 6736. [Google Scholar] [CrossRef]
- Chen, Y.; Zhang, F.; Liu, Z. Adaptive Advantage Estimation for Actor-Critic Algorithms. In 2021 International Joint Conference on Neural Networks (IJCNN); IEEE: Piscataway, NJ, USA, 2021; pp. 1–8. [Google Scholar] [CrossRef]
- Chen, Y.; Zhang, F.; Liu, Z. Adaptive bias-variance trade-off in advantage estimator for actor-critic algorithms. Neural Netw. 2023, 169, 764–777. [Google Scholar] [CrossRef]
- Li, Y.; Yu, C. Flexible Job Shop Scheduling with Job Precedence Constraints: A Deep Reinforcement Learning Approach. J. Manuf. Mater. Process. 2025, 9, 216. [Google Scholar] [CrossRef]
- Wang, Z.; Liao, W. Smart scheduling of dynamic job shop based on discrete event simulation and deep reinforcement learning. J. Intell. Manuf. 2023, 35, 2593–2610. [Google Scholar] [CrossRef]
- Zhang, Y.; Zhu, H.; Tang, D.; Zhou, T.; Gui, Y. Dynamic job shop scheduling based on deep reinforcement learning for multi-agent manufacturing systems. Robot. Comput.-Integr. Manuf. 2022, 78, 102412. [Google Scholar] [CrossRef]
- Yu, C.; Velu, A.; Vinitsky, E.; Wang, Y.; Bayen, A.; Wu, Y. The Surprising Effectiveness of PPO in Cooperative Multi-Agent Games. Adv. Neural Inf. Process. Syst. 2022, 35, 24611–24624. [Google Scholar] [CrossRef]
- Behnke, D.; Geiger, M.J. Test Instances for the Flexible Job Shop Scheduling Problem with Work Centers; Research Paper; Helmut-Schmidt-Universität, Lehrstuhl für Betriebswirtschaftslehre, insbes. Logistik-Management: Hamburg, Germany, 2012. [Google Scholar]
- Lu, Y.; Zhu, Q.; Tian, C.; He, E.; Zhang, T. Low-Carbon and Energy-Efficient Dynamic Flexible Job Shop Scheduling Method Towards Renewable Energy Driven Manufacturing. Machines 2026, 14, 88. [Google Scholar] [CrossRef]
- Cinar, D.; Topcu, Y.I.; Oliveira, J.A. A priority-based genetic algorithm for a flexible job shop scheduling problem. J. Ind. Manag. Optim. 2016, 12, 1391. [Google Scholar] [CrossRef]
- Deb, K.; Agrawal, R.B. Simulated binary crossover for continuous search space. Complex Syst. 1995, 9, 115–148. Available online: http://www.complex-systems.com/abstracts/v09_i02_a02/ (accessed on 11 May 2026).
- Singh, S.S.; Joshi, R.; Gupta, D. An Advantage Actor-Critic Approach for Energy-Conscious Scheduling in Flexible Job Shops. J. Artif. Intell. 2025, 7, 177–203. [Google Scholar] [CrossRef]
- Singh, S.S.; Gupta, D.P. A Soft Actor-Critic Approach for Energy-Conscious Flexible Job Shop Scheduling Incorporating Machine Usage Constraints and Job Release Times. J. Manag. Eng. Integr. 2025, 18, 116–125. [Google Scholar] [CrossRef]





| Problem Class | Representative Studies | Main Objectives Commonly Considered | Contribution to the Literature | Scope Relative to CAFJSP-T | CAFJSP-T Extension |
|---|---|---|---|---|---|
| Classical FJSP | [21,22,23] | Makespan, workload, utilization, cost, and tardiness | Establishes the assignment-sequencing structure of flexible job shop scheduling | Environmental objectives are usually not central | Adds explicit carbon-emission evaluation to the flexible assignment-sequencing structure |
| Tardiness-aware FJSP | [9,32,33] | Tardiness, delay, due-date performance, and makespan | Emphasizes delivery reliability and due-date-oriented service performance | Carbon emissions or energy effects are usually not primary | Retains tardiness penalty as a primary operational objective while adding carbon awareness |
| Green or low-carbon FJSP | [13,27,28] | Carbon emissions, energy, cost, makespan, and customer satisfaction | Incorporates environmental performance into flexible shop scheduling | Due-date performance is often indirect or absent | Makes carbon emissions the primary environmental objective and pairs it directly with tardiness penalty |
| Energy-aware FJSP | [29,34,41] | Energy consumption, energy cost, power, makespan, and delay | Shows how machine states, pricing, and energy use affect scheduling | Energy is often used as the environmental measure rather than direct carbon emissions | Treats energy as a supporting indicator while optimizing carbon emissions directly |
| Integrated green FJSP with transport or AGVs | [30,31,42] | Energy, carbon, makespan, transportation, and AGV coordination | Extends scheduling to integrated production and material-handling decisions | Often focuses on energy, makespan, or transport rather than carbon-tardiness penalty | Provides a basis for future extensions, while CAFJSP-T focuses on the core carbon-tardiness trade-off |
| Carbon or energy plus tardiness scheduling | [38,39,40] | Energy, carbon, delay, tardiness, and makespan | Shows that environmental and due-date measures can be jointly optimized | Often uses energy rather than carbon, non-FJSP settings, or many-objective formulations | Defines a focused FJSP formulation with carbon emissions and tardiness penalty as the primary objectives |
| Symbol | Description |
|---|---|
| Sets and Indices | |
| | Set of jobs |
| | Set of machines |
| | Operations of job j |
| | Set of all operations |
| | Eligible machines for an operation |
| Parameters | |
| | Processing time of an operation on machine m, in minutes |
| | Power consumption of machine m during processing, in kW |
| | Carbon intensity of machine m, in kg CO2/kWh |
| | Due date for job j |
| | Tardiness penalty rate |
| | Baseline carbon value for instance category c |
| | Baseline tardiness penalty value for instance category c |
| | Scalarization weights |
| B | Sufficiently large positive constant |
| Derived Quantities | |
| | Carbon emissions, in kg CO2 |
| Decision Variables | |
| | 1 if the operation is assigned to machine m |
| | Start time of an operation |
| | Completion time of job j |
| | Tardiness of job j |
| | Makespan |
| | Sequencing variable for two operations sharing machine m |
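The notation table lists processing times in minutes, machine power in kW, and carbon intensity in kg CO2/kWh. A minimal sketch (not the paper's implementation; the function and tuple layout are assumptions) of how those units combine into the carbon-emissions quantity:

```python
# Illustrative sketch: per-operation carbon is (minutes / 60) * kW * kg-CO2-per-kWh,
# summed over every operation on its assigned machine.

def carbon_emissions(assignments):
    """assignments: list of (proc_time_min, power_kw, intensity_kg_per_kwh),
    one tuple per operation on its assigned machine."""
    return sum((p / 60.0) * P * ci for p, P, ci in assignments)

# Two operations on a 10 kW machine at 0.5 kg CO2/kWh:
# 60 min -> 5.0 kg and 30 min -> 2.5 kg, i.e. 7.5 kg CO2 in total.
example = [(60, 10.0, 0.5), (30, 10.0, 0.5)]
print(round(carbon_emissions(example), 2))  # -> 7.5
```

The same structure explains why the Energy (kWh) and Carbon (kg CO2) columns in the result tables track each other closely: they differ only by the machine-specific intensity factors.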
| Category | Representative Instance | Carbon Baseline (kg CO2) | Tardiness Penalty Baseline |
|---|---|---|---|
| Sm | sm04_5 | 1869.3974 | 4425.480 |
| Med | med04_5 | 2005.0144 | 2878.902 |
| Lar | lar04_5 | 1820.9377 | 2651.852 |
| Instance | Carbon (kg CO2) | Tardiness Penalty | Energy (kWh) | Makespan (Minutes) | CPU (s) | Optimality Gap (%) |
|---|---|---|---|---|---|---|
| sm01_1 | 140.70 | 0.00 | 140.98 | 157.00 | 0.81 | 2.87 |
| sm01_3 | 142.32 | 0.00 | 142.61 | 159.00 | 0.80 | 2.92 |
| sm02_2 | 297.30 | 17.26 | 297.90 | 217.00 | 1.41 | 4.23 |
| sm03_1 | 739.73 | 478.14 | 741.21 | 428.00 | 3.62 | 7.56 |
| sm04_5 | 1499.46 | 3093.06 | 1502.46 | 864.00 | 8.55 | 11.34 |
| med01_2 | 145.73 | 0.00 | 146.03 | 148.00 | 0.52 | 2.18 |
| med02_1 | 297.84 | 2.07 | 298.44 | 160.00 | 1.59 | 4.67 |
| med02_5 | 302.71 | 5.10 | 303.32 | 173.00 | 1.67 | 5.89 |
| med03_3 | 773.86 | 246.11 | 775.41 | 286.00 | 4.43 | 8.92 |
| med04_5 | 1580.71 | 1685.21 | 1583.88 | 589.00 | 10.51 | 12.78 |
| lar01_1 | 142.18 | 0.00 | 142.47 | 122.00 | 1.00 | 2.43 |
| lar02_3 | 272.26 | 0.07 | 272.81 | 177.00 | 1.33 | 6.12 |
| lar03_2 | 691.81 | 98.43 | 693.19 | 283.00 | 5.57 | 9.45 |
| lar04_1 | 1387.07 | 1170.23 | 1389.85 | 506.00 | 9.39 | 13.21 |
| lar04_5 | 1438.07 | 1180.54 | 1440.95 | 503.00 | 9.95 | 13.67 |
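The Optimality Gap column above follows the usual MILP convention. A small sketch (not the solver's code; the exact denominator convention is solver-dependent, and the incumbent-based form is an assumption here):

```python
# Illustrative sketch: relative MILP optimality gap in percent,
# gap = (incumbent - best bound) / incumbent * 100.

def optimality_gap(incumbent, best_bound):
    return (incumbent - best_bound) / incumbent * 100.0

print(round(optimality_gap(100.0, 95.0), 2))  # -> 5.0
```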
| Weight | Carbon Emissions (kg CO2) | Tardiness Penalty | Energy Consumption (kWh) | Makespan (Minutes) | CPU (s) | Optimality Gap (%) |
|---|---|---|---|---|---|---|
| | 278.3329 | 44.82 | 278.8907 | 262 | 1.07 | 4.87 |
| | 281.3272 | 33.76 | 281.8910 | 233 | 1.07 | 4.52 |
| | 280.4681 | 45.38 | 281.0302 | 256 | 1.10 | 4.91 |
| | 287.2771 | 9.04 | 287.8528 | 196 | 1.77 | 4.23 |
| | 289.3713 | 10.24 | 289.9512 | 208 | 1.17 | 4.35 |
| | 280.1544 | 13.04 | 280.7158 | 206 | 1.03 | 4.41 |
| | 286.8900 | 7.88 | 287.4600 | 183 | 1.04 | 4.19 |
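The weight-sensitivity results rest on a scalarized objective. As a minimal sketch (not the paper's implementation), assuming the baseline-normalized weighted sum implied by the parameter list, with weights w1 and w2 assumed to sum to one and the per-category baselines from the baseline table, the objective can be evaluated as follows; the function name and the equal-weight setting are assumptions:

```python
# Illustrative sketch: Z = w1 * C / C_base + w2 * T / T_base, where C is carbon,
# T is tardiness penalty, and the baselines normalize the two scales.

def scalarized_objective(carbon, tardiness, c_base, t_base, w1, w2):
    assert abs(w1 + w2 - 1.0) < 1e-9, "weights assumed to sum to 1"
    return w1 * carbon / c_base + w2 * tardiness / t_base

# sm04_5 MILP solution against the sm04_5 baselines, equal weights:
z = scalarized_objective(1499.46, 3093.06, 1869.3974, 4425.480, 0.5, 0.5)
print(round(z, 4))  # -> 0.7505
```

Normalizing by category baselines keeps both terms near unit scale, so the weights express the intended carbon versus tardiness trade-off rather than the raw magnitudes.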
| Instance | Weights | Metric | Pro-LNS | PPO-Only | A2C | SAC | GA |
|---|---|---|---|---|---|---|---|
| sm04_5 | | Carbon (kg CO2) | 1475.28 | 1494.61 | 1510.82 | 1521.37 | 1507.19 |
| | | Tardiness Penalty | 3214.78 | 3507.33 | 3702.64 | 3789.41 | 3855.72 |
| | | CPU (s) | 8.59 | 4.81 | 0.61 | 0.15 | 21.07 |
| | | Carbon (kg CO2) | 1499.46 | 1519.71 | 1533.18 | 1545.62 | 1531.44 |
| | | Tardiness Penalty | 3093.06 | 3412.89 | 3528.15 | 3627.83 | 3705.19 |
| | | CPU (s) | 8.55 | 4.83 | 0.78 | 0.17 | 26.07 |
| | | Carbon (kg CO2) | 1508.73 | 1528.94 | 1541.57 | 1554.91 | 1539.82 |
| | | Tardiness Penalty | 3002.51 | 3287.74 | 3414.88 | 3499.43 | 3591.02 |
| | | CPU (s) | 8.54 | 4.87 | 0.91 | 0.20 | 27.87 |
| med04_5 | | Carbon (kg CO2) | 1556.47 | 1577.16 | 1595.43 | 1608.72 | 1591.86 |
| | | Tardiness Penalty | 1812.45 | 1993.69 | 2084.32 | 2139.57 | 2216.38 |
| | | CPU (s) | 10.56 | 5.84 | 0.62 | 0.14 | 25.19 |
| | | Carbon (kg CO2) | 1580.71 | 1605.38 | 1620.65 | 1636.11 | 1617.93 |
| | | Tardiness Penalty | 1685.21 | 1873.58 | 1954.84 | 2012.66 | 2084.19 |
| | | CPU (s) | 10.54 | 5.88 | 0.89 | 0.17 | 28.14 |
| | | Carbon (kg CO2) | 1591.55 | 1613.17 | 1630.82 | 1647.30 | 1626.74 |
| | | Tardiness Penalty | 1634.77 | 1802.46 | 1893.12 | 1942.58 | 2011.83 |
| | | CPU (s) | 10.55 | 5.93 | 0.93 | 0.18 | 29.59 |
| lar04_5 | | Carbon (kg CO2) | 1412.54 | 1432.46 | 1451.93 | 1466.38 | 1447.85 |
| | | Tardiness Penalty | 1298.63 | 1432.58 | 1496.82 | 1540.29 | 1591.46 |
| | | CPU (s) | 9.95 | 5.30 | 0.64 | 0.19 | 23.03 |
| | | Carbon (kg CO2) | 1438.07 | 1461.41 | 1477.63 | 1494.50 | 1473.95 |
| | | Tardiness Penalty | 1180.54 | 1317.46 | 1371.63 | 1413.08 | 1462.71 |
| | | CPU (s) | 9.98 | 5.41 | 0.66 | 0.16 | 29.93 |
| | | Carbon (kg CO2) | 1449.85 | 1469.80 | 1488.15 | 1505.74 | 1482.60 |
| | | Tardiness Penalty | 1138.46 | 1259.81 | 1319.35 | 1354.92 | 1401.28 |
| | | CPU (s) | 9.91 | 5.62 | 0.88 | 0.16 | 29.03 |
| Metric | Friedman Statistic | df | p-Value | Kendall's W | Pro-LNS | PPO-Only | A2C | SAC | GA |
|---|---|---|---|---|---|---|---|---|---|
| Weighted scalarized objective | 34.40 | 4 | | 0.956 | 1.00 | 2.00 | 3.00 | 4.33 | 4.67 |
| Carbon emissions | 36.00 | 4 | | 1.000 | 1.00 | 2.00 | 3.67 | 5.00 | 3.33 |
| Tardiness penalty | 36.00 | 4 | | 1.000 | 1.00 | 2.00 | 3.00 | 4.00 | 5.00 |
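A pure-Python sketch (not the paper's code) of the Friedman statistic and Kendall's W behind the table above. Assuming the per-method columns are mean ranks over n = 9 instance-weight blocks and k = 5 methods with no ties, the identity W = chi2 / (n(k-1)) reproduces the reported pairs, e.g. 36.00 with W = 1.000; function and variable names are assumptions:

```python
# Illustrative sketch: Friedman chi-square and Kendall's W from a matrix of
# objective values (rows = instance-weight blocks, columns = methods).

def friedman_kendall(matrix):
    n, k = len(matrix), len(matrix[0])
    rank_sums = [0.0] * k
    for row in matrix:
        # Rank each row: the best (smallest) value gets rank 1; no tie handling.
        order = sorted(range(k), key=lambda j: row[j])
        for rank, j in enumerate(order, start=1):
            rank_sums[j] += rank
    chi2 = 12.0 * sum(r * r for r in rank_sums) / (n * k * (k + 1)) - 3 * n * (k + 1)
    w = chi2 / (n * (k - 1))  # Kendall's W for untied data
    mean_ranks = [r / n for r in rank_sums]
    return chi2, w, mean_ranks

# A perfectly consistent ordering over 9 blocks gives chi2 = 36.0 and W = 1.0,
# the values reported for carbon emissions and tardiness penalty.
matrix = [[1, 2, 3, 4, 5]] * 9
chi2, w, ranks = friedman_kendall(matrix)
print(chi2, w)  # -> 36.0 1.0
```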
| Comparison | Wilcoxon Statistic | Raw p | Holm-Adjusted p | Objective Improvement (%) | Carbon Reduction (%) | Tardiness Reduction (%) |
|---|---|---|---|---|---|---|
| Pro-LNS vs. PPO-only | 0.00 | 0.0039 | 0.0156 | 4.90 | 1.39 | 9.19 |
| Pro-LNS vs. A2C | 0.00 | 0.0039 | 0.0156 | 7.25 | 2.44 | 13.03 |
| Pro-LNS vs. SAC | 0.00 | 0.0039 | 0.0156 | 8.81 | 3.35 | 15.29 |
| Pro-LNS vs. GA | 0.00 | 0.0039 | 0.0156 | 9.51 | 2.22 | 17.61 |
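The Holm-adjusted column follows the standard step-down correction. A minimal sketch (not the paper's code; the function name is an assumption) shows why four identical raw p-values of 0.0039 all adjust to 4 × 0.0039 = 0.0156, as reported:

```python
# Illustrative sketch: Holm step-down adjustment. Sort raw p-values ascending,
# multiply the i-th smallest by (m - i), enforce monotonicity, and cap at 1.

def holm_adjust(p_values):
    m = len(p_values)
    order = sorted(range(m), key=lambda i: p_values[i])
    adjusted = [0.0] * m
    running_max = 0.0
    for step, i in enumerate(order):
        candidate = min((m - step) * p_values[i], 1.0)
        running_max = max(running_max, candidate)
        adjusted[i] = running_max
    return adjusted

print([round(p, 4) for p in holm_adjust([0.0039, 0.0039, 0.0039, 0.0039])])
# -> [0.0156, 0.0156, 0.0156, 0.0156]
```

With n = 9 paired observations per comparison, 0.0039 is the smallest attainable two-sided Wilcoxon p-value, consistent with the zero test statistics in the table.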
© 2026 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license.
Share and Cite
Singh, S.S.; Gupta, D. A Policy-Based Rough Optimization with Large Neighborhood Search for Carbon-Aware Flexible Job Shop Scheduling with Tardiness Penalty. Computers 2026, 15, 314. https://doi.org/10.3390/computers15050314