Article

Energy-Efficient Scheduling for Resilient Container-Supply Hybrid Flow Shops Under Transportation Constraints and Stochastic Arrivals

1
Logistics Engineering College, Shanghai Maritime University, Pudong, Shanghai 201306, China
2
Business School, Shanghai Dianji University, Pudong, Shanghai 201306, China
3
China Institute of FTZ Supply Chain, Shanghai Maritime University, Pudong, Shanghai 201306, China
*
Author to whom correspondence should be addressed.
J. Mar. Sci. Eng. 2025, 13(6), 1153; https://doi.org/10.3390/jmse13061153
Submission received: 6 May 2025 / Revised: 3 June 2025 / Accepted: 9 June 2025 / Published: 11 June 2025

Abstract

Although dynamic, energy-efficient container-supply hybrid flow shops have attracted increasing attention, most existing research overlooks how transportation within container production affects makespan, resilience, and sustainability. To bridge this gap, we frame a resilient, energy-efficient container-supply hybrid flow shop (TDEHFSP) scheduling model that utilizes vehicle transportation to maximize operational efficiency. To address the TDEHFSP model, the study proposes a Q-learning-based multi-swarm collaborative optimization algorithm (Q-MGCOA). The algorithm integrates a time gap left-shift scheduling strategy with a machine on–off control mechanism to construct an energy-saving optimization framework. Additionally, a predictive–reactive dynamic rescheduling model is introduced to address unexpected task disturbances. To validate the algorithm’s effectiveness, 36 benchmark test cases with varying scales are designed for horizontal comparison. Results show that the proposed Q-MGCOA outperforms benchmarks on convergence, diversity, and supply-chain resilience while lowering energy utilization. Moreover, it achieves about an 8% reduction in energy consumption compared to traditional algorithms. These findings reveal actionable insights for next-generation intelligent, low-carbon container production.

1. Introduction

The Hybrid Flow Shop Problem (HFSP) refers to a production workshop where multiple processes are arranged in a flow line, with at least one process involving multiple machines. The primary challenge is organizing the job sequencing and machine scheduling. HFSP involves multiple processing stages and is known for its flexibility. It is particularly suited for scheduling small-batch, multi-variety production tasks, making it widely applicable in manufacturing. To address environmental protection and emission reduction concerns, energy optimization of the HFSP has become an important research focus.
For energy-efficient hybrid flow shop scheduling (EHFSP), most studies aim to optimize both the makespan ($C_{\max}$) and the total energy consumption (TEC), and they typically focus on a single energy-efficient strategy, such as speed-scaling, turn-on/off mechanisms, or time-of-use pricing.
The speed-scaling strategy refers to the dynamic adjustment of processing speeds during processing. Wang et al. [1] used the speed-scaling strategy for machines and an extended non-dominated sorting genetic algorithm (NSGA-II) to reduce TEC without affecting work efficiency. Wang et al. [2] provided an energy-efficient strategy based on speed selection that optimizes both TEC and $C_{\max}$, taking into account the characteristics of EHFSP. Goli et al. [3] introduced a new meta-heuristic algorithm for determining batch size and speed per machine to minimize TEC and $C_{\max}$ simultaneously.
Time-of-use pricing mechanisms aim to reduce TEC and carbon emission costs by optimizing electricity usage time. Geng et al. [4] studied the distributed heterogeneous re-entrant HFSP with sequence-dependent setup times under time-of-use pricing constraints and proposed a multi-objective artificial bee colony algorithm to solve it. Ho et al. [5] proposed an exact method based on logic-based Benders decomposition to tackle the two-machine problem with time-of-use pricing to minimize TEC. Complementing energy-saving efforts inside container terminals, the ship side is also moving toward alternative fuels to comply with the forthcoming Maritime ETS. Sun et al. [6] analyze shipping alliances and show that deploying LNG-powered vessels can cut ETS allowance costs by roughly 12–18% while enhancing inter-alliance competitiveness. Their work reinforces the need to couple vessel-fuel decisions with landside scheduling models targeting CO2/TEU reduction, thereby justifying the sustainability objective adopted in this study.
In contrast to the speed-scaling strategy, the turn-on/off mechanism has received comparatively little attention. Mu et al. [7] considered the turn-on/off mechanism, established a model for the collaborative scheduling of heterogeneous multi-stage HFSP, and proposed a method combining NSGA-II with the moth-flame optimization algorithm (NSGA-II-MFO). Building on this turn-on/off strategy, the present study puts forward an energy-efficient strategy derived from the “left shift” of time intervals to minimize TEC.
The HFSP machining process affects the manufacturing plant’s productivity and the implementation of the production plan, so some scholars have studied the dynamic hybrid flow shop scheduling problem (DHFSP). The most common dynamic events are the arrival of new jobs and equipment failure. Tao et al. [8] developed an NSGA-II with Q-learning to handle the disruption caused by the arrival of new jobs. Wang et al. [2] studied the DEHFSP with machine failures and proposed an improved multi-objective firefly algorithm that reduces TEC without affecting productivity. Jiang et al. [9] put forward a multi-faceted immune system-inspired algorithm combined with Q-learning to optimize three objectives, namely TEC, $C_{\max}$, and the average agreement index, while simultaneously considering the effects of the above two dynamic events.
Dynamic scheduling can be categorized as fully responsive scheduling, predictive–reactive scheduling, and robust proactive scheduling.
Fully responsive scheduling deals with changing circumstances by continuously observing the state of the job shop and does not require a predefined scheduling scheme. Luo et al. [10] introduced a ubiquitous manufacturing environment with immediate data feedback channels, information exchange, and alignment mechanisms, and a multi-cycle hierarchical scheduling system to solve the HFSP scheduling issue with continuous new jobs. Ghaleb et al. [11] proposed an active–passive spontaneous scheduling model to deal with uncertain, unknown workloads and machine breakdown.
Robust proactive scheduling refers to generating predetermined scheduling plans based on a pre-training process to ameliorate system instability caused by frequent rescheduling. Ashraf et al. [12] study the importance of leveraging technology and distance learning solutions to ensure continuity in education, emphasizing the need for innovative and sustainable approaches to teaching and learning in crisis situations. ElMekkawy et al. [13] exploited the machine’s flexibility and predicted equipment malfunctions, creating an anticipatory scheduling approach to tackle the real-time scheduling challenges of a flexible job shop with potential machinery failures.
Unlike robust proactive scheduling, predictive–reactive scheduling first generates a pre-scheduling scheme in a static environment. It then modifies this scheme to obtain a new schedule according to the latest situation, either after an event occurs or at fixed intervals. Predictive–reactive scheduling approaches primarily include partial rescheduling, complete rescheduling, and insertion rescheduling. Duan et al. [14] analyzed manufacturing system robustness under disruptions while balancing $C_{\max}$, TEC, and resource reusability. A dynamic HFSP framework was introduced by Luo et al. [15], combining complete rescheduling strategies with a hybrid parallel genetic algorithm to manage new job arrivals. Ghaleb et al. [16] designed a genetic algorithm hybridizing predictive and reactive phases to optimize disruption recovery costs and tardiness penalties in complex scenarios. In this paper, the three predictive–reactive scheduling methods are applied simultaneously, and the most robust method is selected for rescheduling based on an instability measure.
With the increasing research on shop scheduling, some scholars have begun to explore the impact of the transportation process on shop scheduling problems. Li et al. [17] took the transportation time of AGVs into account and offered an innovative way to solve integrated production and transportation scheduling in the HFSP environment. Luo et al. [18] incorporated transport resource constraints into the Distributed Assembly Flow Shop Scheduling Problem (DAFSP), enhancing a decomposition-based multi-objective evolutionary algorithm (IMOEA/D) for $C_{\max}$ optimization. Cai et al. [19] investigated the distributed assembly HFSP, which consists of manufacturing and transportation, and offered a shuffled frog-leaping algorithm with Q-learning (QSFLA) to minimize the completion time. While much of the research in flow shop scheduling focuses on distributed flow shops, there is comparatively less research on transportation energy consumption. Zhang et al. [20] proposed a decomposition-based three-stage multi-objective approach (TMOA/D) to solve the EHFSP model, considering for the first time the energy used by machinery during standby and preparation periods together with the energy used during transportation.
The emergence of various incidents and the transportation of jobs within the shop are unavoidable in real production processes. However, no research has been conducted to address both situations in a multi-stage HFSP configuration. This study examines an energy-efficient hybrid flow shop scheduling problem (TDEHFSP) that accounts for the unexpected arrival of new jobs and the transportation process between processing stages, with the objective of simultaneously minimizing $C_{\max}$ and TEC.
In recent years, reinforcement learning techniques have attracted a surge of interest for solving optimization problems. There are two main applications in shop scheduling. The first is to use reinforcement learning as the high-level strategy of a hyper-heuristic that guides the selection among a set of low-level heuristics. Zhang et al. [21] suggested a Q-learning-based hyper-heuristic evolutionary algorithm (QLHHEA) to solve the DABFSP problem aiming at minimization. Zhang et al. [22] developed a Q-learning high-level strategy to select the most appropriate low-level heuristic (LLH) from a pre-designed set based on the observed efficacy of each low-level heuristic. However, the hyper-heuristic algorithm is less suitable for dynamic scheduling research because it requires targeted design of LLHs and has high computational complexity.
The second one is reinforcement learning combined with meta-heuristic algorithms, which can enhance the search efficacy and performance of the algorithm. Jia et al. [23] developed a Q-learning-driven multi-swarm pattern mining algorithm (MPMA) to address distributed assembly HFSP with adaptive preventive maintenance. Li et al. [24] tackled energy-efficient HFSP by minimizing delays, energy costs, and carbon trading expenses through a dual Q-table reinforcement learning mechanism integrated with a generalized variable neighborhood search-enhanced NSGA-II. Zhang et al. [25] focused on the heterogeneous DEHFSP and simultaneously optimized $C_{\max}$ and TEC by utilizing Q-learning to guide the variable neighborhood search. Additionally, they proposed a multi-objective modal factorization algorithm combining particle swarm optimization (PSO) and Q-learning local search to improve the exploration capability of PSO.
These studies demonstrate that the algorithm combining Q-learning and PSO to drive the meta-heuristic process is expected to enhance exploration capability. MGPSO is a multiple swarm particle swarm optimization algorithm proposed based on PSO. Q-learning in this study is employed to learn the optimal behavior on the part of the operators, ensuring the selection of a suitable local search operator throughout the iterative procedure.
The specific value of this paper is threefold. First, we propose a novel energy-efficient hybrid flow shop scheduling problem (TDEHFSP) that simultaneously accounts for dynamic job arrivals and transportation processes, addressing the gap in optimizing both transportation energy consumption and dynamic disturbances in multi-stage HFSP. Second, we develop a multi-objective MINLP model integrating a time interval left-shift strategy and machine turn-on/off mechanisms, providing a quantitative tool for low-carbon production. Third, the proposed Q-MGCOA algorithm dynamically selects search operators via Q-learning, significantly improving convergence and diversity. This research not only reduces makespan and energy consumption but also offers new insights for real-time scheduling and cost control in port container supply chains.

2. Materials and Methods

2.1. Mathematical Modeling

2.1.1. Problem Formulation

This paper studies a multi-stage TDEHFSP. A framework diagram of a hybrid flow shop dynamic scheduling optimization system considering the transportation process is shown in Figure 1. Each machining stage includes one or more machining machines with the same function but varying efficiency. There is a buffer zone before machining stage 1, and all the jobs are stored in the buffer zone at the beginning for use in later processes. Additionally, the study encompasses the transportation of jobs between the various stages of the process, with corresponding transport vehicles established between the stages.
The TDEHFSP framework involves processing a collection of jobs across sequential phases, where each phase utilizes m dedicated machines. Each job consists of an ordered chain of operations and exhibits dynamic arrival patterns, meaning new jobs may emerge unpredictably after initial scheduling execution. Inter-phase material transfer is managed by a specialized transport unit that operates in three distinct modes: no-loading, loading, and idle waiting. Notably, energy consumption characteristics differ between loaded and unloaded transit modes due to varying operational speeds and power requirements, while the standby mode maintains zero energy expenditure.
The assumptions are as follows:
  • Each machine can process only one task at any given moment.
  • Every job must be processed by exactly one designated machine per operation.
  • Job processing must proceed continuously once initiated, without suspension or interruption.
  • All resources (machines/transport units) remain fully operational and available throughout the production timeline.
  • Each operation can be executed on any eligible machine from the available set.
  • Machines operate in four distinct modes: active processing, standby, maintenance, and shutdown, each with unique energy consumption patterns.
  • Transport speed varies between the loaded and unloaded states of the transfer unit.
  • Initial locations of all jobs and transport units reside in an unbounded-capacity buffer zone.
  • Transportation assignments are exclusive—each task requires dedicated transporter allocation, and each transporter handles only one task concurrently.
  • Potential conflicts or collision avoidance between transporters are excluded from consideration.
  • Transport distance computations employ Manhattan metric principles.
  • Existing schedules remain reconfigurable for unprocessed operations when new jobs arrive.
  • New job arrivals occur stochastically without predefined temporal patterns.

2.1.2. Model Building

To establish a multi-objective optimization framework for the TDEHFSP that suits container-supply situations and targets the dual minimization of $C_{\max}$ and TEC, the model follows the earlier study of Zhang et al. [26]. The notation is given in Table 1.
The objective functions are defined as follows:
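$$\min C_{\max} = \min \left\{ \max_{i=1,\dots,n} C_{is} \right\} \tag{1}$$
$$\min TEC = \min \left( E_1 + E_2 + E_3 + E_4 + E_5 \right) \tag{2}$$
$$E_1 = \sum_{k_j=1}^{m} \sum_{i=1}^{n} \sum_{j=1}^{s} a_{ijk_j} \cdot t_{ijk_j} \cdot P_{k_j} \tag{3}$$
$$E_2 = \sum_{k_j=1}^{m} \sum_{i=0,\, i^*=1}^{n} \sum_{j=1}^{s} b_{i^*jk_j} \cdot T_{k_j}^{off} \cdot P_{k_j}^{off} + T_{k_j}^{on} \cdot P_{k_j}^{on}, \quad T_{iji^*jk_j} > T_{k_j} \tag{4}$$
$$E_3 = \sum_{k_j=1}^{m} \sum_{i=1,\, i^*=0}^{n} \sum_{j=1}^{s} b_{i^*k_j} \cdot T_{iji^*jk_j} \cdot IP_{k_j}, \quad T_{iji^*jk_j} < T_{k_j} \tag{5}$$
$$E_4 = \sum_{i=1}^{n} \left[ \sum_{j=1,\, j^*=2}^{s} \sum_{k_j=1,\, k_{j^*}=1}^{m} \sum_{h_j=1}^{q} \left( \frac{|x_{k_{j^*}} - x_{k_j}| + |y_{k_{j^*}} - y_{k_j}|}{v_{zl}} \right) \cdot e_{ik_jk_{j^*}} \cdot c_{ijk_{j^*}h_{j^*}} + \sum_{k_1=1}^{m} \sum_{h_1=1}^{q} \frac{|x_{k_1} - x_b|}{v_{zl}} \cdot a_{i1k_1} \cdot c_{i1k_1h_1} \right] \cdot P_{zl} \tag{6}$$
$$E_5 = \sum_{i=1,\, i^*=2}^{n} \left[ \sum_{j=1,\, j^*=2}^{s} \sum_{k_j=1,\, k_{j^*}=1}^{m} \sum_{h_j=1}^{q} \sum_{e=1,\, e^*=2}^{g} \left( \frac{|x_{k_{j^*}} - x_{k_j}| + |y_{k_{j^*}} - y_{k_j}|}{v_{zn}} \right) \cdot b_{i^*k_j} \cdot a_{i^*jk_j} \cdot a_{ij^*k_{j^*}} \cdot e_{i^*k_jk_{j^*}} \cdot d_{iei^*e^*h_{j^*}} + \sum_{k_1=1}^{m} \sum_{h_1=1}^{q} \sum_{e^*=3,\, e=2}^{g} \frac{|x_b - x_{k_1}|}{v_{zn}} \cdot a_{i1k_1} \cdot d_{i^*fi^*f^*h_1} \right] \cdot P_{zn} \tag{7}$$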
Equations (3)–(5) quantify machine energy consumption during processing, turn-on/off, and standby states, where the idle duration threshold $T_{k_j}$ determines power cycling decisions. Equations (6) and (7) define transporter energy profiles for unloaded and loaded movements.
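The transport terms can be illustrated with a minimal sketch. It assumes, as the assumption list states, that distances follow the Manhattan metric, and that the energy of one move is travel time multiplied by the power of the current mode. The function name, the coordinate tuples, and the numeric values are hypothetical and are not taken from the paper.

```python
def transport_energy(src, dst, speed, power):
    """Energy of a single transporter move between two points.

    src, dst: (x, y) coordinates of the start and end locations.
    speed:    travel speed in the current mode (loaded or unloaded).
    power:    power drawn in that mode.
    All four arguments are illustrative assumptions.
    """
    # Manhattan distance between the two locations.
    distance = abs(dst[0] - src[0]) + abs(dst[1] - src[1])
    # Travel time multiplied by mode-specific power gives the energy.
    return distance / speed * power


# Example: a loaded move is slower and draws more power than an empty one.
loaded = transport_energy((0, 0), (3, 4), speed=2.0, power=5.0)
unloaded = transport_energy((3, 4), (0, 0), speed=3.5, power=2.0)
```

Summing such terms over all assigned moves would give the loaded and unloaded transport energy, with the idle-waiting mode contributing nothing.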
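$$C_{ij} + a_{ijk_j} \cdot t_{ijk_j} + b_{i^*k_j} \cdot T_{iji^*jk_j} \le C_{i^*j^*} + \left(1 - b_{i^*k_j}\right) \cdot L, \quad i, i^* = 1, \dots, n;\ j, j^* = 1, \dots, s;\ k_j = 1, \dots, m \tag{8}$$
$$C_{i(j-1)} + a_{ijk_j} \cdot t_{ijk_j} + b_{i^*k_j} \cdot T_{iji^*jk_j} \le C_{ij} + \left(1 - e_{ik_jk_{j^*}}\right) \cdot L, \quad i, i^* = 1, \dots, n;\ j, j^* = 1, \dots, s;\ k_j, k_{j^*} = 1, \dots, m \tag{9}$$
$$NST_{i^*j^*} + \left(1 - d_{iei^*e^*h_{j^*}}\right) L \ge LET_{ij}, \quad i, i^* = 1, \dots, n;\ j, j^* = 1, \dots, s;\ h_{j^*} = 1, \dots, q;\ e, e^* = 1, \dots, g \tag{10}$$
$$LST_{ij} \ge \max\left\{ LET_{ij},\ C_{i(j-1)} \right\}, \quad i = 1, \dots, n;\ j = 2, \dots, s \tag{11}$$
$$S_{i^*j^*} \ge \max\left\{ LET_{i^*j^*},\ C_{ij} + T_{iji^*jk_j} \right\}, \quad i, i^* = 1, \dots, n;\ j, j^* = 1, \dots, s \tag{12}$$
$$\sum_{k_j=1}^{m} a_{ijk_j} = 1, \quad i = 1, \dots, n;\ j = 1, \dots, s \tag{13}$$
$$\sum_{i=1}^{n} \sum_{j=1}^{s} a_{ijk_j} = 1, \quad k_j = 1, \dots, m \tag{14}$$
$$\sum_{i^*=1}^{n} b_{i^*k_j} = 1, \quad k_j = 1, \dots, m \tag{15}$$
$$\sum_{h_j=1}^{q} c_{ijk_jh_j} = 1, \quad i = 1, \dots, n;\ k_j = 1, \dots, m \tag{16}$$
$$\sum_{i=1}^{n} \sum_{j=1}^{s} c_{ijk_jh_j} = 1, \quad k_j = 1, \dots, m;\ h_j = 1, \dots, q \tag{17}$$
$$T_{N_i}^{on} > T_{N_i}^{A}, \quad N_i = 1, \dots, r \tag{18}$$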
Equations (8) and (9) enforce operation sequencing rules. Equation (10) mandates sequential task completion per transporter. Equation (11) ties machine progression to transporter arrival, while Equation (12) ensures task initiation only after material delivery and predecessor completion. Equation (13) binds each operation to one machine; Equation (14) prohibits machine multitasking. Equation (15) restricts non-initial operations to single predecessors per machine. Equations (16) and (17) enforce one-to-one task-transporter mapping with no concurrent assignments. Equation (18) governs new job insertion timing under arrival uncertainty.

2.2. Rescheduling Strategies

Unlike the research by Zhang et al. [27], this paper adopts three rescheduling methods combined with dynamic strategies to address the new job arrival problem. The descriptions of the three dynamic rescheduling methods are as follows:
  • Reassembling scheduling: When new jobs arrive, the original schedule is preserved, but the system dynamically selects the optimal machine based on real-time load balancing and future time-unit predictions.
  • Complete rescheduling: This method reorganizes all pending tasks and new jobs upon events rather than fixed intervals. It employs heuristics to dynamically adjust task assignments based on real-time device availability and delay risks, leveraging sliding window convolution for future gap detection.
  • Insertion rescheduling: By leveraging sliding window algorithms, the system identifies idle time slots on each machine. New tasks are inserted into the earliest feasible slot to reduce makespan.
The expression for the robustness metric $S_{\min}$ of the rescheduling strategy is as follows [28]:
$$S_{\min} = \frac{\sum_{i=1}^{n} \sum_{j=1}^{O_{ij}} T_{i,j}}{\sum_{i=1}^{n} O_{ij}}$$
$$T_{i,j} = \begin{cases} 1, & \text{if both } O_{ij}^{pre} \text{ and } K_j \text{ of } O_{ij} \text{ are changed} \\ 0.5, & \text{if one of } O_{ij}^{pre} \text{ and } K_j \text{ changes} \\ 0, & \text{if both } O_{ij}^{pre} \text{ and } K_j \text{ of } O_{ij} \text{ remain the same} \end{cases}$$
where $T_{i,j}$ is the instability coefficient, whose value depends on two variables: the machine $K_j$ that processes operation $O_{ij}$ and the previous operation $O_{ij}^{pre}$ on the same machine $K_j$.
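The instability measure can be sketched as follows. The data layout is an assumption made for illustration: each schedule is a dictionary mapping an operation key `(job, stage)` to a pair `(machine, predecessor_on_machine)`. The helper `most_stable` is likewise hypothetical and only shows how the measure could be used to choose among candidate reschedules.

```python
def instability(before, after):
    """Average instability coefficient over all operations.

    before, after: dict mapping (job, stage) -> (machine, predecessor_on_machine).
    This representation is an illustrative assumption, not the paper's encoding.
    """
    if not before:
        return 0.0
    total = 0.0
    for op, (machine_old, pred_old) in before.items():
        machine_new, pred_new = after[op]
        # Count how many of the two attributes changed: 0, 1, or 2.
        changes = (machine_old != machine_new) + (pred_old != pred_new)
        total += 0.5 * changes  # yields 0, 0.5, or 1 per operation
    return total / len(before)


def most_stable(before, candidates):
    """Return the name of the candidate schedule with the lowest instability."""
    return min(candidates, key=lambda name: instability(before, candidates[name]))


original = {(1, 1): (0, None), (2, 1): (0, (1, 1))}
candidates = {
    "insertion": {(1, 1): (0, None), (2, 1): (0, (1, 1))},
    "complete": {(1, 1): (1, None), (2, 1): (0, None)},
}
chosen = most_stable(original, candidates)
```

In this toy case the insertion candidate leaves every operation untouched, so it scores 0 and is selected over the complete reschedule, which scores 0.5.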

2.3. Energy-Efficient Strategy

Based on the established MINLP model and the attributes of TDEHFSP, the “left-shift” strategy and the turn-on/off strategy are proposed to reduce TEC. For the “left-shift” strategy, a temporal gap utilization strategy is implemented during solution decoding by analyzing the idle periods $k_j^{idle}$ of the machine $k_j$ assigned to operation $O_{ij}$ and its predecessor’s completion time $T_{i(j-1)}$. Candidate scheduling gaps $I_{k_j}^{h}$, $h \in [1, N_{k_j}]$, are dynamically generated, and operation $O_{ij}$ is inserted into $I_{k_j}^{h}$ only if the condition $I_{k_j}^{s,h} + t_{ij}^{proc} + t_{ij}^{trans} \le I_{k_j}^{e,h}$ is satisfied, where $I_{k_j}^{s,h}$ and $I_{k_j}^{e,h}$ are the start and end of the gap and $t_{ij}^{proc}$ and $t_{ij}^{trans}$ denote processing and transportation durations. This mechanism optimizes both TEC and $C_{\max}$ by leveraging idle machine intervals.
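A minimal sketch of the gap test behind the left-shift idea is given below. It assumes the idle gaps of one machine are available as sorted `(start, end)` pairs. One detail is an added assumption rather than part of the stated condition: the earliest usable time inside a gap is taken as the later of the gap start and the completion time of the job's previous operation.

```python
def first_feasible_gap(gaps, ready_time, proc_time, trans_time=0.0):
    """Find the earliest idle gap that can host an operation.

    gaps:       list of (gap_start, gap_end) idle intervals on one machine,
                sorted by gap_start (hypothetical representation).
    ready_time: completion time of the job's previous operation.
    Returns (gap_index, processing_start) or None if no gap fits.
    """
    for h, (gap_start, gap_end) in enumerate(gaps):
        # Assumption: work cannot begin before the predecessor has finished.
        earliest = max(gap_start, ready_time)
        # Transport plus processing must fit before the gap closes.
        if earliest + trans_time + proc_time <= gap_end:
            return h, earliest + trans_time
    return None


slot = first_feasible_gap([(0, 3), (5, 12)], ready_time=2, proc_time=4, trans_time=1)
```

Here the first gap is too short (2 + 1 + 4 exceeds 3), so the operation is placed in the second gap and starts processing at time 6.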
The “left shift” approach decreases the machine’s idle waiting time, while the turn-on/off method [29] minimizes the energy used during unavoidable machine idle waiting time. When the energy consumption during machine standby idling exceeds that of turning the machine off and back on, the optimal choice is to temporarily shut down the machine and restart it just before the next job is processed. Otherwise, the machine is kept in standby mode.
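The turn-on/off rule amounts to comparing two energy figures for each idle interval. The sketch below assumes constant power in each state and adds one feasibility check that the text does not state explicitly: the interval must be long enough to complete both the shutdown and the restart. Parameter names are hypothetical.

```python
def idle_policy(idle_time, idle_power, t_off, p_off, t_on, p_on):
    """Choose between standby and an off/on cycle for one idle interval.

    Returns (policy, energy), where policy is "off_on" or "standby".
    """
    standby_energy = idle_time * idle_power
    cycle_energy = t_off * p_off + t_on * p_on
    # Assumption: switching is only possible if the interval covers
    # the full shutdown and restart durations.
    fits = idle_time >= t_off + t_on
    if fits and cycle_energy < standby_energy:
        return "off_on", cycle_energy
    return "standby", standby_energy


long_gap = idle_policy(10, 2, t_off=1, p_off=3, t_on=1, p_on=5)
short_gap = idle_policy(3, 2, t_off=1, p_off=3, t_on=1, p_on=5)
```

With these illustrative numbers, a 10-unit gap costs 20 in standby but only 8 for an off/on cycle, whereas a 3-unit gap is cheaper to bridge in standby.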

3. Q-MGCOA Algorithm

To address the proposed mathematical model, this study combines Q-learning’s learning mechanism with multi-group optimization, proposing the Q-MGCOA algorithm to dynamically select search operators. Figure 2 illustrates the overall architecture of the proposed algorithm.

3.1. Encoding and Decoding

The coding scheme for the TDEHFSP needs to cover four decisions: job sequencing, machine assignment, transport vehicle assignment, and machine turn-on/off selection. To address these needs, an effective coding scheme is employed that encompasses assignment, ordering, and turn-on/off decisions throughout every phase. The coding scheme encodes only the sequence of jobs π = {1,2,…,n} and uses list scheduling (LS) to generate job sequences for subsequent stages.
LS allocates tasks to machines and transporters using the Earliest Completion Time (ECT) strategy. It is important to note that in a uniform parallel machine setup, the ECT approach can produce scheduling outcomes that differ from those of the first available machine (FAM) principle: when a job is directed to the first available machine and transporter, its completion time may still exceed the value achieved under ECT because processing speeds vary across uniform parallel machines. Under the LS framework, jobs are processed sequentially, with each task assigned to the machine and transporter that yield the earliest completion time. This mechanism inherently aligns with the intention to minimize the total makespan. The job list for subsequent stages is dynamically updated based on the completion times of tasks from prior stages.
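The ECT rule for a single stage can be sketched as below. This is a simplified illustration that ignores transporters and energy; the dictionary layout for ready times and processing times is an assumption.

```python
def ect_assign(job_order, ready, proc, machine_free=None):
    """Assign the jobs of one stage to machines by earliest completion time.

    job_order:    job ids in list-scheduling order.
    ready:        dict job -> time the job becomes available at this stage.
    proc:         dict job -> list of processing times, one per machine.
    machine_free: optional list of times at which each machine becomes free.
    Returns dict job -> (machine, start, completion).
    """
    if not job_order:
        return {}
    n_machines = len(proc[job_order[0]])
    free = list(machine_free) if machine_free else [0.0] * n_machines
    schedule = {}
    for job in job_order:
        best = None
        for k in range(n_machines):
            start = max(free[k], ready[job])
            finish = start + proc[job][k]
            # Keep the machine that finishes this job first.
            if best is None or finish < best[2]:
                best = (k, start, finish)
        schedule[job] = best
        free[best[0]] = best[2]
    return schedule


proc = {1: [4, 2], 2: [3, 6], 3: [2, 2]}
ready = {1: 0, 2: 0, 3: 0}
stage = ect_assign([1, 2, 3], ready, proc)
# Job list for the next stage, ordered by completion time at this stage.
next_order = sorted(stage, key=lambda j: stage[j][2])
```

Job 1 goes to the faster second machine even though both machines are free at time 0, which is where ECT and a first-available-machine rule can diverge.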
The active decoding process in this study incorporates two distinct methodologies: initial decoding and rescheduling decoding. The decoding method discussed above corresponds to the initial decoding phase, which constructs a full scheduling plan. Rescheduling decoding serves as an extension of the initial approach, specifically designed to handle adjustments following dynamic events such as unexpected task arrivals or machine failures.

3.2. Initialization

This section presents a novel NEH-based initialization method tailored for the TDEHFSP. Unlike traditional NEH algorithms [30] that focus solely on minimizing makespan, our approach integrates multi-objective optimization by simultaneously considering $C_{\max}$ and TEC. The initialization process consists of two distinct steps:
Step 1: Dual-Objective Initial Population Generation.
Objective 1: Generate an individual using $C_{\max}$ as the minimization target, following the classic NEH heuristic.
Objective 2: Generate another individual using TEC as the minimization target, incorporating energy-aware scheduling rules. This dual-objective approach ensures a diverse initial population that balances efficiency and sustainability.
Step 2: Residual Population Construction.
The remaining individuals in the population are randomly generated to maintain genetic diversity. This step prevents premature convergence and enhances the algorithm’s exploration capability.
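The two seeded individuals can be pictured with a generic NEH-style insertion routine that takes the objective as a parameter. This is a sketch of the classic NEH pattern only; the evaluators for $C_{\max}$ and TEC are not shown, and the toy objective in the example (sum of completion times on one machine) is a stand-in chosen purely so the snippet runs.

```python
def neh(jobs, total_time, objective):
    """Build a job sequence by NEH-style insertion.

    jobs:       iterable of job ids.
    total_time: dict job -> total processing time over all stages.
    objective:  callable mapping a (partial) sequence to a value to minimize,
                e.g. a makespan or an energy evaluator (hypothetical here).
    """
    # Consider jobs in order of decreasing total processing time.
    order = sorted(jobs, key=lambda j: total_time[j], reverse=True)
    seq = []
    for job in order:
        # Try every insertion position and keep the best partial sequence.
        candidates = [seq[:p] + [job] + seq[p:] for p in range(len(seq) + 1)]
        seq = min(candidates, key=objective)
    return seq


times = {1: 3, 2: 1, 3: 2}


def toy_objective(seq):
    """Stand-in objective: sum of completion times on a single machine."""
    clock, total = 0, 0
    for job in seq:
        clock += times[job]
        total += clock
    return total


seed = neh([1, 2, 3], times, toy_objective)
```

Calling the routine once with a makespan evaluator and once with an energy evaluator would give the two seeded individuals, with the rest of the population filled in randomly.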

3.3. Genetic Operation and Elite Preservation

Evolutionary processes, such as recombination and variation, serve as the foundation for investigating the feasible region and averting suboptimal local solutions. Because of the ordered nature of job sequences, this paper uses two-point order crossover and swap sequence mutations that have been widely used in HFSP.
The assessment of solution quality uses fast non-dominated sorting (FNS) and crowding distance estimation to formulate the Pareto dominance comparator for solution evaluation. The operators “≺” and “≻” symbolize “less optimal” and “more optimal”, respectively. In minimization-focused problems, a smaller objective value corresponds to a better solution, and vice versa.
Definition 1.
If a solution $x_1$ Pareto-dominates another solution $x_2$, it obeys
$$f_j(x_1) \succ f_j(x_2), \quad \forall j \in \{1, 2, 3\}.$$
Definition 2.
If a solution $i$ is better than $j$, it obeys
$$i \succ_n j: \ \text{if } i_{rank} < j_{rank} \ \text{or} \ \left( i_{rank} = j_{rank} \ \text{and} \ i_{distance} > j_{distance} \right),$$
where $i_{rank}$ and $i_{distance}$ are the solution rank and crowding distance, respectively, and $f_j$ is the objective function.
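The two comparisons can be sketched in a few lines. The dominance test below uses the standard minimization definition (no worse in every objective and strictly better in at least one), which may differ slightly in strictness from the condition in Definition 1. The second function follows the rank and distance rule of Definition 2.

```python
def dominates(f1, f2):
    """True if objective vector f1 Pareto-dominates f2 (minimization)."""
    no_worse = all(a <= b for a, b in zip(f1, f2))
    strictly_better = any(a < b for a, b in zip(f1, f2))
    return no_worse and strictly_better


def crowded_better(rank_i, dist_i, rank_j, dist_j):
    """True if solution i is preferred over j: lower rank, then larger distance."""
    return rank_i < rank_j or (rank_i == rank_j and dist_i > dist_j)


# Objective vectors are (makespan, energy); the numbers are illustrative only.
a, b, c = (10, 5), (12, 5), (9, 7)
```

In this example `a` dominates `b`, while `a` and `c` are mutually non-dominated, so the crowding distance would decide between them if they share a rank.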

3.4. Neighborhood Structure and Local Search

In the proposed Q-MGCOA algorithm, the search process is driven by Q-learning integrated into the multi-group particle swarm optimization (MGPSO) framework, which selects the most suitable search operators during the evolutionary process [8]. To adjust the search range and strategy of the particles in real-time, a simple perturbation operator is introduced as the neighborhood structure, i.e., Destruction–Construction (DC). In addition, the MGPSO extends the velocity update equation to include a collaborative archive mechanism to facilitate information exchange between subpopulations.
Adopting a collaborative archive mechanism in local search, where particles refer to solutions in the archive to adjust their positions and velocities, facilitates information exchange between subpopulations and further enhances the efficacy of localized search. The local search operator is key to improving the method. This paper uses three search operators, namely critical swap (CSwap), critical insertion (CInsr), and critical inverse (CInv).
The MGPSO extended velocity update equation can be stated as
$$V_i^{t+1} = \omega V_i^{t} + c_1 r_1 \left( y_i^{t} - x_i^{t} \right) + \lambda_i c_2 r_2 \left( \hat{y}_i^{t} - x_i^{t} \right) + \left( 1 - \lambda_i \right) c_3 r_3 \left( \hat{a}_i^{t} - x_i^{t} \right),$$
In this equation, $V_i^{t}$ denotes the velocity of particle $i$ at iteration $t$, $\omega$ represents the inertia weight, $c_1$, $c_2$, and $c_3$ are acceleration coefficients, and $r_1$, $r_2$, and $r_3$ are random vectors with components uniformly distributed between 0 and 1. The terms $x_i^{t}$ and $y_i^{t}$ signify the position of particle $i$ and its best position at iteration $t$, respectively. Furthermore, $\hat{y}_i^{t}$ and $\hat{a}_i^{t}$ are the global best position and neighborhood optimal position, respectively, while $\lambda_i$ stands for the exploitation trade-off coefficient associated with particle $i$.
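A component-wise sketch of this velocity update is shown below for continuous position vectors stored as Python lists. How velocities are mapped back to job permutations is not covered here, and the argument names (`y_hat`, `a_hat`, `lam`) are labels chosen for this illustration.

```python
import random


def mgpso_velocity(v, x, pbest, y_hat, a_hat, w, c1, c2, c3, lam, rng=random):
    """One velocity update with personal, swarm-level, and archive-guided terms.

    v, x:   current velocity and position of the particle.
    pbest:  the particle's own best position.
    y_hat:  best position shared within the swarm.
    a_hat:  guide position drawn from the collaborative archive.
    lam:    trade-off coefficient in [0, 1] between y_hat and a_hat.
    """
    new_v = []
    for d in range(len(x)):
        # Fresh uniform random factors for every dimension.
        r1, r2, r3 = rng.random(), rng.random(), rng.random()
        new_v.append(
            w * v[d]
            + c1 * r1 * (pbest[d] - x[d])
            + lam * c2 * r2 * (y_hat[d] - x[d])
            + (1 - lam) * c3 * r3 * (a_hat[d] - x[d])
        )
    return new_v


# If the particle already sits on all three guides, only inertia remains.
x = [1.0, 2.0]
v_next = mgpso_velocity([0.5, -1.0], x, x, x, x, w=0.4, c1=1.5, c2=1.5, c3=1.5, lam=0.5)
```

A value of `lam` close to 1 weights the swarm-level guide more heavily, while a value close to 0 shifts the pull toward the archive guide.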
This study redefines the state and action sets of Q-learning. The state set $S_1$ represents different motion modes and destruction–construction strategies of the particle swarm. For example, different inertia weights, velocity updating rules, and DC strategies can form the state set $S_1$, denoted as $S_1 = \{IW_1, IW_2, \dots, IW_N, D_1, D_2, \dots, D_M\}$. The state set $S_2$, on the other hand, represents the different local search operators, i.e., $S_2 = \{\text{CSwap}, \text{CInsr}, \text{CInv}\}$. Each action set is the same as the corresponding state set, i.e., $A_1 = S_1$ and $A_2 = S_2$. For detailed information, refer to Figure 3.
To address multiple objectives in reward allocation, the reward function is adapted to a multi-objective version through normalization and summation over the objectives. The new reward function considers not only the improvement of the objective functions but also the improvement that the new solution brings to the archive. Specifically, the reward function can be stated as
$$r_t = \sum_{k} \frac{f_k^{\min}}{f_k(x_t)} \cdot \frac{f_k(x_t) - f_k(x_{t+1})}{f_k^{\max} - f_k^{\min}} + \beta \times \mathit{archive},$$
where $x_t$ is the solution at time $t$, and $x_{t+1}$ is the new solution. $f_k(\cdot)$ denotes the $k$-th objective function, and $f_k^{\max}$ and $f_k^{\min}$ are the worst and best values found so far for the $k$-th objective function, respectively. $\mathit{archive}$ denotes the improvement of the new solution on the set of solutions in the archive, and $\beta$ is the moderating factor.
Algorithm 1 is the detailed procedure of MGPSO with Q-learning, made up of two steps. Step 1 is particle search and Q1 training. Evaluate the new position of each particle perturbation and update the Q1 table. Step 2 is the local search and Q2 training phase, which utilizes the collaborative archive to guide the local search and update Q2.
Algorithm 1: MGPSO with Q-learning
Input: learning factor α, discount factor γ, epsilon-greedy factor ε,
initial solution x_0, state sets S_1, S_2, action sets A_1, A_2, number of episodes E,
maximum number of non-improving iterations Iter_Max
Output: best solution x*, Q-tables Q1, Q2
Initialise Q-tables Q1, Q2 as zero matrices
Initialise global best solution x* = x_0
Divide particles into M subpopulations
Initialise particle positions and velocities
Initialise personal best position p_i for each particle
Initialise neighbourhood structure for each particle
# Particle Search and Q1 Training
For t = 1 : Max-iter do
  For each subpopulation m in 1 : M do
    For each particle i in subpopulation m do
      Select action a_1 from A_1 using Epsilon-greedy(Q1)
      Apply Destruction-Construction to particle i based on a_1
      Evaluate new position x_i
      Update personal best position p_i if x_i is better
      Calculate reward r
      Update Q1 using Q-learning (α, γ, s_1, a_1, r)
    End for
  End for
  # Local Search and Q2 Training
  Identify and update global best solution x*
  Update collaborative archive with new non-dominated solutions
  For each particle i in 1 : P do
    Select action a_2 from A_2 using Epsilon-greedy(Q2)
    Apply local search operator to particle i based on a_2
    Evaluate new position x_i
    Update personal best position p_i if x_i is better
    Calculate reward r
    Update Q2 using Q-learning (α, γ, s_2, a_2, r)
  End for
  # Update Velocity and Position
  For each particle i in 1 : P do
    Update velocity v_i and position x_i using PSO equations and collaborative archive
  End for
  # Check for Stopping Criteria
  If no improvement in x* for Iter_Max iterations then
    Break
  End if
End for
Return best solution x*
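The two Q-learning primitives that Algorithm 1 relies on, epsilon-greedy action selection and the tabular Q-update, can be sketched as follows. This is a minimal illustration of standard tabular Q-learning, not the authors' implementation: the function names, the list-of-lists Q-table layout, and the toy sizes (2 states, 3 actions) are assumptions made for the example.

```python
import random

def epsilon_greedy(q_table, state, n_actions, epsilon, rng=random):
    """With probability epsilon explore a random action, otherwise exploit the best one."""
    if rng.random() < epsilon:
        return rng.randrange(n_actions)
    row = q_table[state]
    return max(range(n_actions), key=lambda a: row[a])

def q_update(q_table, state, action, reward, next_state, alpha, gamma):
    """Standard tabular update: Q(s,a) += alpha * (r + gamma * max Q(s',.) - Q(s,a))."""
    best_next = max(q_table[next_state])
    td_target = reward + gamma * best_next
    q_table[state][action] += alpha * (td_target - q_table[state][action])
    return q_table[state][action]

# Hypothetical sizes: 2 states and 3 actions; the table starts at zero as in Algorithm 1.
q1 = [[0.0] * 3 for _ in range(2)]
new_value = q_update(q1, state=0, action=1, reward=1.0, next_state=1,
                     alpha=0.1, gamma=0.1)
greedy_action = epsilon_greedy(q1, state=0, n_actions=3, epsilon=0.0)
```

With `epsilon=0.0` the selection is purely greedy, so after one rewarded update of action 1 the same action is chosen again; a positive epsilon keeps some exploration of the other operators.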
The complexity of DC is $O(d + d\,n\sum_{j=1}^{s} l_j)$, where $\sum_{j=1}^{s} l_j$ is the total number of machines. The worst-case complexity of the Q-learning process is $O(d_{max})$. The complexity of the three local search operators in computing the critical path is $O(n\sum_{j=1}^{s} l_j)$. The complexity of the Q-learning update is $O(3)$. So, the total complexity is $O(E(d + d\,n\sum_{j=1}^{s} l_j + d_{max} + n\sum_{j=1}^{s} l_j + 3))$. The simplified complexity is $O(n\sum_{j=1}^{s} l_j)$.
Q-MGCOA adopts an event-triggered reinforcement learning rescheduling model for dynamic events. It integrates an offline mode for static scheduling and an online mode for dynamic scheduling with new job arrivals. In the offline mode, an MGPSO algorithm based on perturbation operators and a Q-learning-driven mutation mechanism drives population evolution and hands over to the online mode when a dynamic event is detected. In the online mode, three rescheduling heuristics are activated to respond to dynamic events and adaptively adjust the original schedule. The complete workflow is illustrated in Figure 4.

4. Experiment Validation and Result Analysis

This section first generates benchmark test instances of the TDEHFSP and then identifies the crucial parameters of Q-MGCOA through systematic parameter-tuning experiments.

4.1. Experiment Settings

Because prior research offers few TDEHFSP benchmark instances, random test cases were generated. Instance names follow the format “jobs-stages-machines per stage-new job arrivals”. For instance, “10-3-4-3” denotes 10 jobs, 3 stages, 4 machines per stage, and 3 new job arrivals. These instances are abstracted from the manufacturing process of container production, specifically the assembly of containers. The simplified process includes five stages: material preparation, cutting, welding (machine welding and manual touch-up), assembly, and painting. Table 2 provides the experimental setup (Zhang et al. [26]).
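As a small illustration of the naming convention, the following sketch parses an instance name into its four fields. The `Instance` type and the `parse_instance` helper are hypothetical names introduced for this example only.

```python
from typing import NamedTuple

class Instance(NamedTuple):
    jobs: int
    stages: int
    machines_per_stage: int
    new_jobs: int

def parse_instance(name: str) -> Instance:
    """Split a 'jobs-stages-machines-new arrivals' name into integer fields."""
    parts = name.split("-")
    if len(parts) != 4:
        raise ValueError(f"expected four dash-separated fields, got {name!r}")
    return Instance(*(int(p) for p in parts))

example = parse_instance("10-3-4-3")
total_operations = example.jobs * example.stages  # one operation per job per stage
```

Here `total_operations` counts the operations of the initial jobs only; newly arriving jobs would add `new_jobs * stages` further operations.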
The parameter settings in the present study are based on classic research in the field of hybrid flow shop scheduling (Luo et al. [31]), as these scales effectively balance computational complexity and the diversity of real-world production scenarios.

4.2. Evaluation Indicators

GD, IGD, and Spread are adopted to evaluate the convergence, comprehensiveness, and distribution uniformity of Pareto fronts, respectively. These metrics are widely used in multi-objective optimization to holistically assess algorithm performance.
Spacing Metric (SM): A lower value of SM indicates a more uniform distribution of solutions along the Pareto front. The formula is as follows:
$$SM = \sqrt{\frac{\sum_{i=1}^{|A|} \left(d_{min}^{i} - \overline{d_{min}}\right)^2}{|A|}},$$
where $d_{min}^{i}$ is the minimum, over all other solutions in the obtained non-dominated set $A$, of the sum of absolute differences in normalized objective values between the $i$-th solution and that solution, and $\overline{d_{min}}$ denotes the average value of $d_{min}^{i}$.
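The SM computation can be sketched as below. The sketch assumes the root-mean-square form with division by $|A|$ (the radical is an assumption, taken from the commonly used spacing metric) and assumes the objective vectors have already been normalized; `spacing_metric` is a hypothetical helper name.

```python
import math

def spacing_metric(front):
    """front: list of normalized objective vectors (at least two solutions)."""
    d_min = []
    for i, p in enumerate(front):
        # Smallest sum of absolute objective differences to any other solution.
        d_min.append(min(
            sum(abs(a - b) for a, b in zip(p, q))
            for j, q in enumerate(front) if j != i
        ))
    mean_d = sum(d_min) / len(d_min)
    return math.sqrt(sum((d - mean_d) ** 2 for d in d_min) / len(d_min))

even_front = [(0.0, 1.0), (0.5, 0.5), (1.0, 0.0)]
uneven_front = [(0.0, 1.0), (0.1, 0.9), (1.0, 0.0)]
sm_even = spacing_metric(even_front)
sm_uneven = spacing_metric(uneven_front)
```

An evenly spaced front gives identical nearest-neighbour distances and therefore an SM of zero, while the clustered front yields a strictly larger value.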
Generational Distance (GD) represents the average distance between the obtained Pareto front ($PF_a$) and the actual Pareto front ($PF^*$), reflecting the degree of proximity between them. The formula is as follows:
$$GD = \frac{\sqrt{\sum_{i=1}^{n_{PF_a}} d_i^2}}{n_{PF_a}},$$
where $d_i$ is the smallest Euclidean distance between the $i$-th individual in $PF_a$ and any individual in $PF^*$, and $n_{PF_a}$ is the number of individuals in $PF_a$. This study uses max–min normalization, with $PF^*$ taken as the combined Pareto front derived from all algorithms in the given scenario. Max–min normalization scales objective values to [0, 1] for fair comparison across heterogeneous objectives. Compared to Z-score normalization, this method reduces computational instability.
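A minimal sketch of max–min normalization followed by GD is given below. It assumes the common form of GD (square root of the summed squared distances, divided by the number of obtained solutions) and assumes that the extremes used for scaling come from the reference front; both function names and the sample objective values are illustrative, not taken from the experiments.

```python
import math

def min_max_normalize(points, reference):
    """Scale each objective to [0, 1] using the extremes of the reference set."""
    dims = range(len(reference[0]))
    lo = [min(p[k] for p in reference) for k in dims]
    hi = [max(p[k] for p in reference) for k in dims]
    span = [(hi[k] - lo[k]) or 1.0 for k in dims]  # guard against zero range
    return [tuple((p[k] - lo[k]) / span[k] for k in dims) for p in points]

def generational_distance(pf_a, pf_star):
    """GD = sqrt(sum of squared nearest distances from pf_a to pf_star) / |pf_a|."""
    dists = [min(math.dist(p, q) for q in pf_star) for p in pf_a]
    return math.sqrt(sum(d * d for d in dists)) / len(pf_a)

# Illustrative (makespan, TEC) pairs; these numbers are made up for the example.
reference_raw = [(100.0, 900.0), (200.0, 700.0)]
pf_star = min_max_normalize(reference_raw, reference_raw)
gd_same = generational_distance(pf_star, pf_star)
gd_shifted = generational_distance([(0.0, 1.0), (1.0, 0.3)], pf_star)
```

A front that coincides with the reference gives GD = 0; shifting one of two points by 0.3 in one normalized objective gives GD = 0.3 / 2 = 0.15.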
The inverse generational distance (IGD) serves as a comprehensive metric for assessing both the convergence and the diversity of a solution set produced by an algorithm. The expression is as follows:
$$IGD = \frac{\sum_{i=1}^{n_{PF^*}} d_i^*}{n_{PF^*}},$$
where $d_i^*$ is the smallest Euclidean distance between the $i$-th solution in $PF^*$ and any individual in $PF_a$, and $n_{PF^*}$ is the number of individuals in $PF^*$.
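The IGD definition translates almost directly into code. The sketch below assumes both fronts are already normalized to the same scale; the function name is a hypothetical choice for this example.

```python
import math

def inverse_generational_distance(pf_a, pf_star):
    """Mean distance from each reference solution to its nearest obtained solution."""
    return sum(min(math.dist(r, p) for p in pf_a) for r in pf_star) / len(pf_star)

pf_star = [(0.0, 1.0), (0.5, 0.5), (1.0, 0.0)]
pf_a = [(0.0, 1.0), (1.0, 0.0)]  # misses the middle of the reference front
igd_gap = inverse_generational_distance(pf_a, pf_star)
igd_full = inverse_generational_distance(pf_star, pf_star)
```

Because the distances are measured from the reference front, a set that converges well but leaves part of the front uncovered is still penalized, which is why IGD reflects diversity as well as convergence.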
The Spread metric quantifies the dispersion of solutions along the Pareto front, indicating the diversity of the obtained set. A lower Spread value implies a more even distribution of solutions and therefore better diversity in the solution set generated by the optimization method. The expression is as follows:
$$Spread = \frac{d_f + d_l + \sum_{i=1}^{n-1} \left| d_{i,i+1} - \overline{d_{i,i+1}} \right|}{d_f + d_l + (n-1)\,\overline{d_{i,i+1}}},$$
Here, $d_f$ represents the Euclidean distance between the smallest and largest objective function values in the obtained set, while $d_l$ signifies the distance from the nearest solution to the true Pareto frontier. $d_{i,i+1}$ is the Euclidean distance between adjacent solutions within the set, and $\overline{d_{i,i+1}}$ is the mean distance across neighboring solution pairs.
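The Spread ratio can be sketched as follows. To avoid committing to one interpretation of the boundary terms, the sketch takes $d_f$ and $d_l$ as inputs supplied by the caller and assumes the solutions are already sorted along the front; `spread` is a hypothetical helper name.

```python
import math

def spread(front, d_f, d_l):
    """front: solutions sorted along the front; d_f, d_l: boundary distance terms."""
    gaps = [math.dist(a, b) for a, b in zip(front, front[1:])]  # n - 1 adjacent gaps
    mean_gap = sum(gaps) / len(gaps)
    numerator = d_f + d_l + sum(abs(g - mean_gap) for g in gaps)
    denominator = d_f + d_l + len(gaps) * mean_gap
    return numerator / denominator

even_front = [(0.0, 1.0), (0.5, 0.5), (1.0, 0.0)]
uneven_front = [(0.0, 1.0), (0.1, 0.9), (1.0, 0.0)]
spread_even = spread(even_front, d_f=0.0, d_l=0.0)
spread_uneven = spread(uneven_front, d_f=0.0, d_l=0.0)
```

With zero boundary terms, equally spaced solutions give a Spread of zero, and any deviation of the gaps from their mean raises the value.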

4.3. The Result of Parameter Tuning

To find a good parameter combination more quickly, a Taguchi experiment was set up with three levels (low, medium, high) for each of five factors: termination threshold (Q-$iter$), population size ($P_s$), mutation rate ($M_r$), crossover rate ($C_r$), and computational budget coefficient ($C_t$). The factor levels are listed in Table 3 (Zhang et al. [26]). Peize et al. [24] obtained the following parameter settings for this type of problem: number of generations $Max\_Gen = 100$, learning factor $\alpha = 0.1$, epsilon-greedy factor $\varepsilon = 0.1$, discount factor $\lambda = 0.1$, and a maximum DC depth of 8.
The main-effect analysis in Figure 5 shows that, for the 20-3-3-3 instance, $C_t$ and Q-$iter$ influence optimization performance more strongly than $P_s$. The calibrated Q-MGCOA configuration is $P_s = 120$, $M_r = 0.9$, $C_r = 0.3$, Q-$iter = 3$, and $C_t = 0.8$. To ensure equitable comparisons, all benchmark tests use the same genetic operators, $P_s = 120$, and the termination criterion $C_t \times n \times m$.

4.4. Algorithm Comparison and Analysis

Q-MGCOA is compared with the traditional NSGA-II [25] and with multi-objective evolutionary algorithms (MOEAs) such as the improved Jaya [32] and MOEA/D [33] to reveal performance differences in optimization tasks.
Across the 36 generated scenarios, each algorithm was run independently five times. The AVG and STD of the evaluation metrics GD, IGD, and Spread were computed for each algorithm. The results are presented in Table 4, Table 5 and Table 6, where bolded values indicate the best performance among the four algorithms, and the hit rate reflects how often each algorithm achieved the best result across all instances. Box plots of the GD, IGD, and Spread results for the different algorithms are displayed in Figure 6.
Table 4 compares the Generational Distance (GD) of NSGA-II, Jaya, MOEA/D, and Q-MGCOA across 36 scenarios. GD measures the proximity to the true Pareto front. Q-MGCOA shows superiority, achieving 27 optimal AVGs and 28 optimal STDs, indicating it can obtain a solution set closer to the optimal Pareto boundary.
Table 5 presents the inverse generational distance (IGD) analysis of four algorithms in 36 scenarios. IGD assesses solution set quality comprehensively. Q-MGCOA performs best with 25 optimal AVGs and 26 optimal STDs, suggesting it generates Pareto frontiers with better convergence and diversity.
Table 6 displays the Spread metric results for NSGA-II, Jaya, MOEA/D, and Q-MGCOA in 36 cases. Spread measures solution dispersion on the Pareto front. Q-MGCOA obtains 29 optimal AVGs and 21 optimal STDs, meaning its Pareto fronts are more uniformly distributed.
The boxplot in Figure 6 supports these conclusions and visually demonstrates the advantage of the proposed algorithm on the three evaluation metrics: GD, IGD, and Spread. Q-MGCOA achieves the best average value in approximately 75%, 69%, and 81% of instances for GD, IGD, and Spread, respectively. In summary, compared to NSGA-II, Jaya, and MOEA/D, Q-MGCOA demonstrates superior overall performance, confirming the effectiveness of the method for solving TDEHFSP problems.
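The hit-rate percentages follow directly from the hit counts in Tables 4 to 6. The sketch below shows how a hit count can be derived from rows of average values and how the reported counts map to the quoted percentages; `count_hits` is a hypothetical helper, and ties (if any) are assumed to go to the first algorithm in column order.

```python
from collections import Counter

ALGORITHMS = ("NSGA-II", "Jaya", "MOEA/D", "Q-MGCOA")

def count_hits(avg_rows):
    """Count, per algorithm, the instances on which it has the lowest average value."""
    wins = Counter()
    for values in avg_rows.values():
        best = min(range(len(values)), key=lambda k: values[k])
        wins[ALGORITHMS[best]] += 1
    return wins

# Three GD AVG rows copied from Table 4, in the column order of ALGORITHMS.
sample = {
    "10-3-3-3": (0.0196, 0.0372, 0.0614, 0.0253),
    "10-3-4-3": (0.0269, 0.0573, 0.0457, 0.0187),
    "10-5-3-3": (0.0678, 0.0858, 0.0531, 0.0164),
}
sample_hits = count_hits(sample)

# Reported Q-MGCOA hit counts (AVG) out of 36 instances, from Tables 4 to 6.
reported = {"GD": 27, "IGD": 25, "Spread": 29}
rates = {metric: round(100 * hits / 36) for metric, hits in reported.items()}
```

All three metrics are minimized, so the lowest average counts as the hit; 27/36, 25/36, and 29/36 round to 75%, 69%, and 81%.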
Figure 7 visualizes the convergence performance of the algorithms by plotting the IGD values over time for three instances: 10-5-3-3, 20-5-3-3, and 50-5-3-3. The IGD metric was selected because it is a comprehensive metric. As shown in Figure 7a, Q-MGCOA converges to the minimum value faster in the small-scale instance, indicating a well-balanced global exploration and local exploitation. Figure 7b shows that Q-MGCOA slightly outperforms NSGA-II, consistent with the IGD results in Table 5 for medium-scale instances. Additionally, Figure 7c shows that the convergence curve of Q-MGCOA is smoother than that of the other algorithms, demonstrating its robustness in solving large-scale instances. In summary, these results suggest that Q-MGCOA is more efficient in solving TDEHFSP problems.
To further evaluate the algorithm’s performance, the same three test instances, 10-5-3-3, 20-5-3-3, and 50-5-3-3, were used to compare Pareto fronts. The resulting Pareto front comparison is shown in Figure 8. As shown in Figure 8, the schedules obtained by Q-MGCOA have significantly lower maximum makespan and total energy consumption than those of the Jaya and MOEA/D algorithms. NSGA-II can achieve a lower maximum makespan, but at the same level of maximum makespan the schedules produced by Q-MGCOA consume less total energy. As shown in Table 7, the minimum total energy consumption (TEC) values achieved by Q-MGCOA are about 8%, 8%, and 6% lower than those of NSGA-II for the three instances. This further confirms the efficacy of the proposed energy-saving strategies and rescheduling. Moreover, it indicates that Q-MGCOA can explore more feasible solutions without losing the current optimal solutions, demonstrating strong optimization capabilities and delivering practical guidelines for smart, green container-terminal operations.
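The quoted energy savings can be checked against Table 7 with a few lines of arithmetic. The values below are copied from Table 7; the dictionary layout and the `reduction_percent` helper are assumptions made for this example.

```python
# Lowest TEC (kWh) per instance and algorithm, copied from Table 7.
lowest_tec = {
    "10-5-3-3": {"NSGA-II": 823, "Jaya": 847, "MOEA/D": 878, "Q-MGCOA": 758},
    "20-5-3-3": {"NSGA-II": 1348, "Jaya": 1447, "MOEA/D": 1596, "Q-MGCOA": 1236},
    "50-5-3-3": {"NSGA-II": 2908, "Jaya": 3004, "MOEA/D": 3321, "Q-MGCOA": 2745},
}

def reduction_percent(baseline, value):
    """Relative saving of `value` with respect to `baseline`, in percent."""
    return 100.0 * (baseline - value) / baseline

vs_nsga2 = {
    instance: reduction_percent(row["NSGA-II"], row["Q-MGCOA"])
    for instance, row in lowest_tec.items()
}
rounded = [round(v) for v in vs_nsga2.values()]
```

The unrounded savings relative to NSGA-II are roughly 7.9%, 8.3%, and 5.6%, which round to the 8%, 8%, and 6% stated in the text.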

5. Conclusions

Optimizing energy-efficient job sequencing in container-oriented hybrid flow shops has become critical for container production resilience and sustainability. First, a MINLP model for the Resilient Container-supply TDEHFSP (RC-TDEHFSP) is established that jointly minimizes makespan, total energy consumption (TEC), and a novel supply-chain resilience index, and an energy-efficient strategy that reduces TEC without affecting productivity is proposed. Second, to address the scheduling challenges caused by the delayed introduction of new jobs in TDEHFSP, a novel algorithm, Q-MGCOA, is developed to solve this model by leveraging the strengths of multiple algorithms, ensuring adaptability and efficiency in dynamic scenarios.
To evaluate the viability of the suggested method, the optimized Q-MGCOA is compared with NSGA-II, improved Jaya, and MOEA/D in this paper. Compared to the other three algorithms, the percentage of performance dominance of Q-MGCOA on the three evaluated metrics (GD, IGD, and Spread) is about 75%, 69%, and 81% in 36 computational instances of different sizes. Moreover, it achieves about an 8% reduction in energy consumption compared to traditional algorithms. The comparative analysis reveals that the algorithm demonstrates superior performance in addressing TDEHFSP, notably outperforming the other three approaches.
Still, this study has several notable limitations. First, the scheduling framework exclusively accounts for dynamic job arrivals as the primary uncertainty factor, neglecting potential disruptions from other sources. Second, the material transfer analysis focuses solely on transport vehicles without incorporating auxiliary handling systems like overhead cranes that play vital roles in specialized production environments. Subsequent investigations should expand the model’s adaptability by integrating complementary dynamic variables and diversified material transfer mechanisms.

Author Contributions

Methodology, H.S. (Huaixia Shi) and H.S. (Huaqiang Si); Validation, H.S. (Huaixia Shi) and H.S. (Huaqiang Si); Formal analysis, H.S. (Huaixia Shi); Resources, H.S. (Huaqiang Si); Data curation, H.S. (Huaixia Shi) and H.S. (Huaqiang Si); Writing—original draft, H.S. (Huaixia Shi); Writing—review and editing, J.Q. (Jiyun Qin); Visualization, H.S. (Huaixia Shi) and H.S. (Huaqiang Si). All authors have read and agreed to the published version of the manuscript.

Funding

This research received no external funding.

Data Availability Statement

Data are contained within the article.

Conflicts of Interest

The authors declare no conflicts of interest.

References

  1. Wang, Y.J.; Li, J.; Wang, G.G. Fuzzy correlation entropy-based NSGA-II for energy-efficient hybrid flow-shop scheduling problem. Knowl.-Based Syst. 2023, 277, 110808.
  2. Wang, Z.; Shen, L.; Li, X.; Gao, L. An improved multi-objective firefly algorithm for energy-efficient hybrid flowshop rescheduling problem. J. Clean. Prod. 2023, 385, 135738.
  3. Goli, A.; Ala, A.; Hajiaghaei-Keshteli, M. Efficient multi-objective meta-heuristic algorithms for energy-aware non-permutation flow-shop scheduling problem. Expert Syst. Appl. 2023, 213, 119077.
  4. Geng, K.; Liu, L.; Wu, Z. Energy-efficient distributed heterogeneous re-entrant hybrid flow shop scheduling problem with sequence dependent setup times considering factory eligibility constraints. Sci. Rep. 2022, 12, 18741.
  5. Ho, M.H.; Hnaien, F.; Dugardin, F. Exact method to optimise the total electricity cost in two-machine permutation flow shop scheduling problem under Time-of-use tariff. Comput. Oper. Res. 2022, 144, 105788.
  6. Cui, D.; Zheng, J.; Sun, Y.; He, X.; Zhao, Z. Deploying Liquefied Natural Gas-Powered Ships in Response to the Maritime Emission Trading System: From the Perspective of Shipping Alliances. J. Mar. Sci. Eng. 2025, 13, 551.
  7. Zhang, F.; Mu, H.; Wang, Z.; Chen, J.; Zhang, G.; Wang, S. A Flow Shop Scheduling Method Based on Dual BP Neural Networks with Multi-Layer Topology Feature Parameters. Systems 2024, 12, 339.
  8. Tao, X.R.; Pan, Q.K.; Sang, H.Y.; Gao, L.; Yang, A.-L.; Rong, M. Nondominated sorting genetic algorithm-II with Q-learning for the distributed permutation flowshop rescheduling problem. Knowl.-Based Syst. 2023, 278, 110880.
  9. Jiang, C.; Wang, Z.; Chen, S.; Li, J.; Wang, H.; Xiang, J.; Xiao, W. Attention-shared multi-agent actor–critic-based deep reinforcement learning approach for mobile charging dynamic scheduling in wireless rechargeable sensor networks. Entropy 2022, 24, 965.
  10. Luo, H.; Fang, J.; Huang, G.Q. Real-time scheduling for hybrid flowshop in ubiquitous manufacturing environment. Comput. Ind. Eng. 2015, 84, 12–23.
  11. Ghaleb, M.; Zolfagharinia, H.; Taghipour, S. Real-time production scheduling in the Industry-4.0 context: Addressing uncertainties in job arrivals and machine breakdowns. Comput. Oper. Res. 2020, 123, 105031.
  12. Khan, M.A.; Ahmed, T.; Ashraf, S.; Arfeen, Z.A. SLM-OJ: Surrogate Learning Mechanism during Outbreak Juncture. August 2022 2020, 6, 162–167.
  13. Al-Hinai, N.; ElMekkawy, T.Y. Robust and stable flexible job shop scheduling with random machine breakdowns using a hybrid genetic algorithm. Int. J. Prod. Econ. 2011, 132, 279–291.
  14. Duan, J.; Wang, J. Robust scheduling for flexible machining job shop subject to machine breakdowns and new job arrivals considering system reusability and task recurrence. Expert Syst. Appl. 2022, 203, 117489.
  15. Fujimura, S.; Plazolles, B.; Luo, J.; El Baz, D. GPU based parallel genetic algorithm for solving an energy efficient dynamic flexible flow shop scheduling problem. J. Parallel Distrib. Comput. 2019, 133, 244–257.
  16. Ghaleb, M.; Taghipour, S.; Zolfagharinia, H. Real-time integrated production-scheduling and maintenance-planning in a flexible job shop with machine deterioration and condition-based maintenance. J. Manuf. Syst. 2021, 61, 423–449.
  17. Li, W.; Gao, L.; Han, D.; Li, Y.; Li, X. Integrated production and transportation scheduling method in hybrid flow shop. Chin. J. Mech. Eng. 2022, 35, 12.
  18. Gong, G.; Zhao, X.; Luo, Q.; Guo, X.; Chen, L.; Deng, Q. Modelling and optimisation of distributed assembly hybrid flowshop scheduling problem with transportation resource scheduling. Comput. Ind. Eng. 2023, 186, 109717.
  19. Cai, J.; Wang, J.; Wang, L.; Lei, D. A novel shuffled frog-leaping algorithm with reinforcement learning for distributed assembly hybrid flow shop scheduling. Int. J. Prod. Res. 2023, 61, 1233–1251.
  20. Peng, K.-K.; Gao, L.; Zhang, B.; Pan, Q.-K.; Meng, L.-L.; Li, X.-Y. A three-stage multiobjective approach based on decomposition for an energy-efficient hybrid flow shop scheduling problem. IEEE Trans. Syst. Man Cybern. Syst. 2019, 50, 4984–4999.
  21. Zhang, Z.Q.; Qian, B.; Hu, R.; Yang, J.-B. Q-learning-based hyper-heuristic evolutionary algorithm for the distributed assembly blocking flowshop scheduling problem. Appl. Soft Comput. 2023, 146, 110695.
  22. Zhang, Z.Q.; Wu, F.C.; Qian, B.; Hu, R.; Wang, L.; Jin, H.-P. A Q-learning-based hyper-heuristic evolutionary algorithm for the distributed flexible job-shop scheduling problem with crane transportation. Expert Syst. Appl. 2023, 234, 121050.
  23. Jia, Y.; Yan, Q.; Wang, H. Q-learning driven multi-population memetic algorithm for distributed three-stage assembly hybrid flow shop scheduling with flexible preventive maintenance. Expert Syst. Appl. 2023, 232, 120837.
  24. Xue, Q.; Zhou, D.; Zhang, Z.; Chen, J.; Li, P. Multi-objective energy-efficient hybrid flow shop scheduling using Q-learning and GVNS driven NSGA-II. Comput. Oper. Res. 2023, 159, 106360.
  25. Zhang, W.; Li, C.; Gen, M.; Yang, W.; Zhang, G. A multiobjective memetic algorithm with particle swarm optimisation and Q-learning-based local search for energy-efficient distributed heterogeneous hybrid flow-shop scheduling problem. Expert Syst. Appl. 2024, 237, 121570.
  26. Zhang, Q.; Duan, J.; Qin, J.; Zhou, Y.; Shi, H.; Nie, L.; Si, H. Double Deep Q-Network-Based Solution to a Dynamic, Energy-Efficient Hybrid Flow Shop Scheduling System with the Transport Process. Systems 2025, 13, 170.
  27. Zhang, W.; Gen, M.; Jo, J. Hybrid sampling strategy-based multiobjective evolutionary algorithm for process planning and scheduling problem. J. Intell. Manuf. 2014, 25, 881–897.
  28. Chen, X.; Li, Y.; Zhang, L.; Gao, K.; An, Y. Multiobjective flexible job-shop rescheduling with new job insertion and machine preventive maintenance. IEEE Trans. Cybern. 2022, 53, 3101–3113.
  29. Li, Z.; Mumtaz, J.; Yue, L.; Wang, H.; Rauf, M. Energy-efficient scheduling of a two-stage flexible printed circuit board flow shop using a hybrid Pareto spider monkey optimisation algorithm. J. Ind. Inf. Integr. 2023, 31, 100412.
  30. Meyer, P.; Mohammadi, M.; Pasdeloup, B.; Karimi-Mamaghan, M. Learning to select operators in meta-heuristics: An integration of Q-learning into the iterated greedy algorithm for the permutation flowshop scheduling problem. Eur. J. Oper. Res. 2023, 304, 1296–1330.
  31. Luo, H.; Du, B.; Huang, G.Q.; Chen, H.; Li, X. Hybrid flow shop scheduling considering machine electricity consumption cost. Int. J. Prod. Econ. 2013, 146, 423–439.
  32. Li, Z.; Gao, K.; Wu, N.; Pan, Y. Solving biobjective distributed flow-shop scheduling problems with lot-streaming using an improved Jaya algorithm. IEEE Trans. Cybern. 2022, 53, 3818–3828.
  33. Li, X.; Li, P.; Wang, G.; Gao, L. Energy-efficient distributed heterogeneous welding flow shop scheduling problem using a modified MOEA/D. Swarm Evol. Comput. 2021, 62, 100858.
Figure 1. The layout of a hybrid flow shop considering transportation processes.
Figure 2. The overall framework of the Q-MGCOA algorithm.
Figure 3. States S2 and actions A2 of Q-learning.
Figure 4. The dynamic and static rescheduling model framework.
Figure 5. Main effect plot of parameter tuning.
Figure 6. The boxplot of GD (a), IGD (b), and Spread (c) results.
Figure 7. The convergence curve of 10-5-3-3 (a), 20-5-3-3 (b), and 50-5-3-3 (c).
Figure 8. The Pareto front comparison diagram of 10-5-3-3 (a), 20-5-3-3 (b), and 50-5-3-3 (c).
Table 1. Notations of present study.

| Parameters | Definition | Parameters | Definition |
|---|---|---|---|
| $n$ | Sum of jobs number | $e$ | The $e$-th transport task ($e = 1, \ldots, g$) |
| $s$ | Sum of stages number | $h_j$ | Transport vehicle index ($h_j = 1, \ldots, q$) |
| $m$ | Sum of corresponding workshop machines | $v_z^n$, $v_z^l$ | No-load/load transport speed of transport vehicles |
| $j$ | Stage index ($j = 1, \ldots, s$) | $O_{ij}$ | $j$-th operation of job $i$ |
| $i$ | Job index ($i = 1, \ldots, n$) | $x_b$ | Horizontal coordinate of the buffer |
| $N\_i$ | New job index ($N\_i = 1, \ldots, r$) | $(x_{k_j}, y_{k_j})$ | Coordinates of $k_j$ |
| $k_j$ | Machine index ($k_j = 1, \ldots, m$) | $P_{k_j}$, $IP_{k_j}$ | Processing/standby power of $k_j$ |
| $C_i$ | Transportation time of the first job | $C_{is}$ | Ending time of final stage jobs |
| $P_Z^n$, $P_Z^l$ | No-load/load transport power of transport vehicles | $P_{k_j}^{on}$, $P_{k_j}^{off}$ | Turn-on/off power of $k_j$ |
| $T_{k_j}^{on}$, $T_{k_j}^{off}$ | Turn-on/off time of $k_j$ | $T_{k_j}$ | Time constant |
| $T_{ij i^* j k_j}$ | Idle time of $k_j$ | $T_{N\_i}^{on}$ | Time to start processing of new jobs |
| $t_{ijk_j}$ | Processing time of $O_{ij}$ on $k_j$ | $T_{N\_i}^{A}$ | Arrival time of new jobs |
| $NST_{ij}$, $NET_{ij}$ | Starting/ending time of the no-load transport task $O_{ij}$ | $LST_{ij}$, $LET_{ij}$ | Starting/ending time of the load transport task $O_{ij}$ |

| Variables | Definition |
|---|---|
| $a_{ijk_j}$ | 1 if the operation $O_{ij}$ is processed on machine $k_j$, 0 otherwise. |
| $b_{i^* k_j}$ | 1 if job $i^*$ is the subsequent job of job $i$ processed on machine $k_j$, 0 otherwise. |
| $c_{i k_j h_j}$ | 1 if job $i$ is processed on machine $k_j$ and is transported by transport vehicle $h_j$, 0 otherwise. |
| $d_{i e i^* e^* h_j^*}$ | 1 if operation $O_{i^* j^*}$ is a post-operation of $O_{ij}$ processed on machine $k_j$, 0 otherwise. |
| $e_{i k_j k_j^*}$ | 1 if the subsequent operation of job $i$ after processing on machine $k_j$ is carried out on machine $k_j^*$, 0 otherwise. |
Table 2. Test example parameter settings (Zhang et al. [26]).

| Factors | Levels |
|---|---|
| Sum of jobs | {10, 20, 50} |
| Sum of stages | {3, 5} |
| Sum of machines per stage | {3, 4, 5} |
| Number of new jobs arrived | {3, 5} |
| Number of transport vehicles per stage | 2 |
| Processing time per operation | U(20, 30) min |
| Power of machine | U(5, 10) × 10^3 W |
| Speed of no-load/load transport vehicles | 30/20 m/min |
| Power of no-load/load transport vehicles | 100/200 W |
| Standby power of the machine | 2 |
| Reset power of the machine | 4 |
Table 3. Parameter levels (Zhang et al. [26]).

| Factors | Level 1 | Level 2 | Level 3 |
|---|---|---|---|
| $C_t$ | 0.5 | 0.8 | 1.2 |
| $M_r$ | 0.6 | 0.8 | 0.9 |
| Q-$iter$ | 2 | 3 | 5 |
| $C_r$ | 0.1 | 0.3 | 0.5 |
| $P_s$ | 80 | 100 | 120 |
Table 4. The analysis outcome of GD.

| Instance | NSGA-II AVG | NSGA-II STD | Jaya AVG | Jaya STD | MOEA/D AVG | MOEA/D STD | Q-MGCOA AVG | Q-MGCOA STD |
|---|---|---|---|---|---|---|---|---|
| 10-3-3-3 | 0.0196 | 0.0084 | 0.0372 | 0.0071 | 0.0614 | 0.0179 | 0.0253 | 0.0082 |
| 10-3-3-5 | 0.0239 | 0.0066 | 0.0404 | 0.0239 | 0.0484 | 0.0155 | 0.0384 | 0.0209 |
| 10-3-4-3 | 0.0269 | 0.0051 | 0.0573 | 0.0277 | 0.0457 | 0.0136 | 0.0187 | 0.0034 |
| 10-3-4-5 | 0.0290 | 0.0123 | 0.0763 | 0.0255 | 0.0614 | 0.0191 | 0.0218 | 0.0022 |
| 10-3-6-3 | 0.0624 | 0.0138 | 0.0406 | 0.0206 | 0.0552 | 0.0104 | 0.0205 | 0.0113 |
| 10-3-6-5 | 0.0236 | 0.0051 | 0.0397 | 0.0179 | 0.0716 | 0.0400 | 0.0185 | 0.0039 |
| 10-5-3-3 | 0.0678 | 0.0630 | 0.0858 | 0.0412 | 0.0531 | 0.0070 | 0.0164 | 0.0021 |
| 10-5-3-5 | 0.0332 | 0.0145 | 0.0616 | 0.0255 | 0.0501 | 0.0186 | 0.0232 | 0.0078 |
| 10-5-4-3 | 0.0208 | 0.0054 | 0.1047 | 0.0627 | 0.0515 | 0.0238 | 0.0232 | 0.0084 |
| 10-5-4-5 | 0.0204 | 0.0064 | 0.0459 | 0.0200 | 0.1732 | 0.1732 | 0.0189 | 0.0020 |
| 10-5-6-3 | 0.0449 | 0.0364 | 0.0649 | 0.0372 | 0.0770 | 0.0586 | 0.0207 | 0.0022 |
| 10-5-6-5 | 0.0295 | 0.0064 | 0.0745 | 0.0572 | 0.1732 | 0.1732 | 0.0147 | 0.0021 |
| 20-3-3-3 | 0.0168 | 0.0050 | 0.0272 | 0.0055 | 0.0611 | 0.0463 | 0.0175 | 0.0028 |
| 20-3-3-5 | 0.0176 | 0.0026 | 0.0265 | 0.0059 | 0.0659 | 0.0602 | 0.0191 | 0.0068 |
| 20-3-4-3 | 0.0158 | 0.0033 | 0.0344 | 0.0064 | 0.0661 | 0.0604 | 0.0177 | 0.0032 |
| 20-3-4-5 | 0.0184 | 0.0038 | 0.0451 | 0.0170 | 0.0641 | 0.0204 | 0.0173 | 0.0023 |
| 20-3-6-3 | 0.0225 | 0.0079 | 0.0465 | 0.0256 | 0.0476 | 0.0288 | 0.0185 | 0.0022 |
| 20-3-6-5 | 0.0252 | 0.0088 | 0.0534 | 0.0254 | 0.0424 | 0.0119 | 0.0309 | 0.0101 |
| 20-5-3-3 | 0.0389 | 0.0318 | 0.0459 | 0.0141 | 0.0771 | 0.0209 | 0.0179 | 0.0033 |
| 20-5-3-5 | 0.0293 | 0.0153 | 0.0454 | 0.0133 | 0.0734 | 0.0570 | 0.0166 | 0.0032 |
| 20-5-4-3 | 0.0305 | 0.0083 | 0.0747 | 0.0277 | 0.0765 | 0.0552 | 0.0215 | 0.0044 |
| 20-5-4-5 | 0.0216 | 0.0050 | 0.0643 | 0.0267 | 0.0956 | 0.0612 | 0.0206 | 0.0045 |
| 20-5-6-3 | 0.0213 | 0.0032 | 0.0401 | 0.0316 | 0.1183 | 0.0335 | 0.0162 | 0.0039 |
| 20-5-6-5 | 0.0292 | 0.0081 | 0.1732 | 0.1732 | 0.0551 | 0.0251 | 0.0184 | 0.0049 |
| 50-3-3-3 | 0.0150 | 0.0045 | 0.0516 | 0.0227 | 0.0910 | 0.0522 | 0.0177 | 0.0038 |
| 50-3-3-5 | 0.0162 | 0.0021 | 0.0757 | 0.0571 | 0.1081 | 0.0477 | 0.0176 | 0.0051 |
| 50-3-4-3 | 0.0260 | 0.0082 | 0.0386 | 0.0054 | 0.0749 | 0.0250 | 0.0172 | 0.0039 |
| 50-3-4-5 | 0.0192 | 0.0073 | 0.0580 | 0.0288 | 0.0662 | 0.0175 | 0.0159 | 0.0035 |
| 50-3-6-3 | 0.0162 | 0.0039 | 0.0390 | 0.0125 | 0.0672 | 0.0175 | 0.0151 | 0.0077 |
| 50-3-6-5 | 0.0144 | 0.0022 | 0.0539 | 0.0243 | 0.0522 | 0.0194 | 0.0134 | 0.0013 |
| 50-5-3-3 | 0.0219 | 0.0048 | 0.0781 | 0.0563 | 0.0732 | 0.0273 | 0.0132 | 0.0039 |
| 50-5-3-5 | 0.0217 | 0.0072 | 0.0368 | 0.0072 | 0.0674 | 0.0138 | 0.0160 | 0.0018 |
| 50-5-4-3 | 0.0282 | 0.0101 | 0.1732 | 0.1732 | 0.0801 | 0.0578 | 0.0169 | 0.0033 |
| 50-5-4-5 | 0.0170 | 0.0045 | 0.0358 | 0.0095 | 0.0476 | 0.0153 | 0.0125 | 0.0018 |
| 50-5-6-3 | 0.0158 | 0.0044 | 0.0312 | 0.0058 | 0.0912 | 0.0496 | 0.0121 | 0.0013 |
| 50-5-6-5 | 0.0248 | 0.0114 | 0.0782 | 0.0534 | 0.0507 | 0.0161 | 0.0204 | 0.0122 |
| Hit rate | 9/36 | 8/36 | 0/36 | 0/36 | 0/36 | 0/36 | 27/36 | 28/36 |
Table 5. The analysis outcome of IGD.

| Instance | NSGA-II AVG | NSGA-II STD | Jaya AVG | Jaya STD | MOEA/D AVG | MOEA/D STD | Q-MGCOA AVG | Q-MGCOA STD |
|---|---|---|---|---|---|---|---|---|
| 10-3-3-3 | 0.0438 | 0.0061 | 0.1414 | 0.0303 | 0.2610 | 0.0717 | 0.0696 | 0.0045 |
| 10-3-3-5 | 0.0821 | 0.0211 | 0.1818 | 0.1108 | 0.2132 | 0.0706 | 0.1240 | 0.0392 |
| 10-3-4-3 | 0.0898 | 0.0388 | 0.2151 | 0.1456 | 0.2077 | 0.0697 | 0.0797 | 0.0192 |
| 10-3-4-5 | 0.0783 | 0.0336 | 0.3230 | 0.1192 | 0.2984 | 0.1078 | 0.0724 | 0.0100 |
| 10-3-6-3 | 0.0436 | 0.0106 | 0.1348 | 0.0501 | 0.2470 | 0.0769 | 0.0546 | 0.0195 |
| 10-3-6-5 | 0.0649 | 0.0130 | 0.1742 | 0.0908 | 0.2939 | 0.1698 | 0.0633 | 0.0171 |
| 10-5-3-3 | 0.2684 | 0.3571 | 0.3379 | 0.1383 | 0.2118 | 0.0627 | 0.0637 | 0.0074 |
| 10-5-3-5 | 0.1182 | 0.0644 | 0.2754 | 0.1194 | 0.2219 | 0.0853 | 0.0924 | 0.0295 |
| 10-5-4-3 | 0.0630 | 0.0201 | 0.5256 | 0.3428 | 0.2248 | 0.1433 | 0.0828 | 0.0232 |
| 10-5-4-5 | 0.0678 | 0.0142 | 0.1907 | 0.0672 | 0.9000 | 0.9000 | 0.0790 | 0.0055 |
| 10-5-6-3 | 0.1144 | 0.1033 | 0.2749 | 0.1819 | 0.3808 | 0.3120 | 0.0698 | 0.0088 |
| 10-5-6-5 | 0.0880 | 0.0334 | 0.3570 | 0.3119 | 0.9000 | 0.9000 | 0.0543 | 0.0033 |
| 20-3-3-3 | 0.0598 | 0.0211 | 0.1173 | 0.0279 | 0.2583 | 0.1792 | 0.0669 | 0.0146 |
| 20-3-3-5 | 0.0761 | 0.0217 | 0.1221 | 0.0304 | 0.3086 | 0.3312 | 0.0832 | 0.0302 |
| 20-3-4-3 | 0.0670 | 0.0159 | 0.1604 | 0.0390 | 0.3145 | 0.3283 | 0.0704 | 0.0158 |
| 20-3-4-5 | 0.0527 | 0.0043 | 0.1962 | 0.0974 | 0.2874 | 0.0833 | 0.0599 | 0.0095 |
| 20-3-6-3 | 0.0870 | 0.0322 | 0.2120 | 0.1345 | 0.2098 | 0.1350 | 0.0677 | 0.0139 |
| 20-3-6-5 | 0.0716 | 0.0333 | 0.2224 | 0.1197 | 0.1825 | 0.0491 | 0.1247 | 0.0544 |
| 20-5-3-3 | 0.1500 | 0.1264 | 0.2209 | 0.0647 | 0.3240 | 0.0845 | 0.0728 | 0.0151 |
| 20-5-3-5 | 0.1300 | 0.0855 | 0.1845 | 0.0688 | 0.3514 | 0.3114 | 0.0715 | 0.0163 |
| 20-5-4-3 | 0.1202 | 0.0315 | 0.3405 | 0.1305 | 0.3656 | 0.3022 | 0.0819 | 0.0282 |
| 20-5-4-5 | 0.0875 | 0.0102 | 0.2852 | 0.1302 | 0.4374 | 0.3031 | 0.0867 | 0.0239 |
| 20-5-6-3 | 0.0983 | 0.0206 | 0.1830 | 0.1516 | 0.5681 | 0.1927 | 0.0657 | 0.0109 |
| 20-5-6-5 | 0.1084 | 0.0303 | 0.9000 | 0.9000 | 0.2380 | 0.1072 | 0.0717 | 0.0245 |
| 50-3-3-3 | 0.0611 | 0.0222 | 0.2466 | 0.1233 | 0.4474 | 0.2762 | 0.0711 | 0.0136 |
| 50-3-3-5 | 0.0576 | 0.0063 | 0.3677 | 0.3066 | 0.5104 | 0.2527 | 0.0705 | 0.0201 |
| 50-3-4-3 | 0.0976 | 0.0364 | 0.1773 | 0.0313 | 0.3253 | 0.1463 | 0.0740 | 0.0132 |
| 50-3-4-5 | 0.0723 | 0.0204 | 0.2678 | 0.1182 | 0.2776 | 0.0697 | 0.0657 | 0.0160 |
| 50-3-6-3 | 0.0633 | 0.0216 | 0.1867 | 0.0710 | 0.2880 | 0.0862 | 0.0607 | 0.0341 |
| 50-3-6-5 | 0.0619 | 0.0122 | 0.2538 | 0.1252 | 0.2183 | 0.0759 | 0.0561 | 0.0070 |
| 50-5-3-3 | 0.0917 | 0.0253 | 0.3859 | 0.2989 | 0.3329 | 0.1387 | 0.0562 | 0.0201 |
| 50-5-3-5 | 0.0940 | 0.0322 | 0.1652 | 0.0367 | 0.3044 | 0.0736 | 0.0675 | 0.0077 |
| 50-5-4-3 | 0.1259 | 0.0453 | 0.9000 | 0.9000 | 0.3886 | 0.3059 | 0.0701 | 0.0173 |
| 50-5-4-5 | 0.0734 | 0.0247 | 0.1688 | 0.0510 | 0.2128 | 0.0755 | 0.0549 | 0.0110 |
| 50-5-6-3 | 0.0646 | 0.0153 | 0.1562 | 0.0284 | 0.4282 | 0.2778 | 0.0524 | 0.0065 |
| 50-5-6-5 | 0.1094 | 0.0576 | 0.3813 | 0.2926 | 0.2245 | 0.0670 | 0.0858 | 0.0503 |
| Hit rate | 11/36 | 10/36 | 0/36 | 0/36 | 0/36 | 0/36 | 25/36 | 26/36 |
Table 6. The analysis outcome of the Spread.

| Instance | NSGA-II AVG | NSGA-II STD | Jaya AVG | Jaya STD | MOEA/D AVG | MOEA/D STD | Q-MGCOA AVG | Q-MGCOA STD |
|---|---|---|---|---|---|---|---|---|
| 10-3-3-3 | 0.3013 | 0.0928 | 0.4808 | 0.0739 | 0.5019 | 0.0256 | 0.2482 | 0.0227 |
| 10-3-3-5 | 0.2856 | 0.0383 | 0.3869 | 0.0783 | 0.5320 | 0.0338 | 0.4257 | 0.0262 |
| 10-3-4-3 | 0.4361 | 0.0736 | 0.4980 | 0.0581 | 0.5195 | 0.0519 | 0.2832 | 0.0695 |
| 10-3-4-5 | 0.3800 | 0.0922 | 0.5257 | 0.0695 | 0.5937 | 0.0370 | 0.2994 | 0.0433 |
| 10-3-6-3 | 0.1667 | 0.0393 | 0.4326 | 0.0856 | 0.5990 | 0.0460 | 0.1766 | 0.0309 |
| 10-3-6-5 | 0.3450 | 0.1039 | 0.4333 | 0.0642 | 0.5241 | 0.0786 | 0.2322 | 0.0554 |
| 10-5-3-3 | 0.4227 | 0.0814 | 0.4346 | 0.0492 | 0.4929 | 0.0387 | 0.2542 | 0.0378 |
| 10-5-3-5 | 0.4275 | 0.1076 | 0.5295 | 0.0846 | 0.4958 | 0.0702 | 0.2929 | 0.0330 |
| 10-5-4-3 | 0.2501 | 0.0557 | 0.4325 | 0.1146 | 0.5131 | 0.0167 | 0.2833 | 0.0816 |
| 10-5-4-5 | 0.3934 | 0.0649 | 0.5342 | 0.0574 | 0.4790 | 0.0971 | 0.2633 | 0.0307 |
| 10-5-6-3 | 0.4020 | 0.1018 | 0.4387 | 0.0677 | 0.4874 | 0.1029 | 0.3210 | 0.0357 |
| 10-5-6-5 | 0.2842 | 0.0376 | 0.4373 | 0.0474 | 0.5429 | 0.0491 | 0.1806 | 0.0254 |
| 20-3-3-3 | 0.2844 | 0.0949 | 0.4016 | 0.0367 | 0.5696 | 0.0467 | 0.2204 | 0.0382 |
| 20-3-3-5 | 0.3248 | 0.0462 | 0.4592 | 0.0162 | 0.5076 | 0.0728 | 0.2804 | 0.0440 |
| 20-3-4-3 | 0.2339 | 0.0495 | 0.4603 | 0.0619 | 0.4689 | 0.0607 | 0.2566 | 0.0510 |
| 20-3-4-5 | 0.3481 | 0.0452 | 0.5097 | 0.1092 | 0.4727 | 0.0543 | 0.3489 | 0.0632 |
| 20-3-6-3 | 0.3671 | 0.1033 | 0.5195 | 0.0687 | 0.5073 | 0.0664 | 0.2959 | 0.0500 |
| 20-3-6-5 | 0.4153 | 0.0563 | 0.5119 | 0.0954 | 0.4566 | 0.0506 | 0.3598 | 0.0538 |
| 20-5-3-3 | 0.3886 | 0.0970 | 0.5797 | 0.0797 | 0.4451 | 0.0575 | 0.2714 | 0.0597 |
| 20-5-3-5 | 0.4510 | 0.0516 | 0.4707 | 0.0880 | 0.5026 | 0.0990 | 0.2967 | 0.0228 |
| 20-5-4-3 | 0.3528 | 0.0487 | 0.4711 | 0.0657 | 0.5122 | 0.1099 | 0.3582 | 0.0461 |
| 20-5-4-5 | 0.3458 | 0.0835 | 0.4149 | 0.0424 | 0.4965 | 0.0907 | 0.3221 | 0.0228 |
| 20-5-6-3 | 0.3160 | 0.0518 | 0.5113 | 0.1079 | 0.4629 | 0.1117 | 0.2546 | 0.0765 |
| 20-5-6-5 | 0.4670 | 0.0570 | 0.5314 | 0.0440 | 0.4167 | 0.0327 | 0.2705 | 0.0409 |
| 50-3-3-3 | 0.3265 | 0.0624 | 0.4936 | 0.0617 | 0.4900 | 0.0653 | 0.3163 | 0.0591 |
| 50-3-3-5 | 0.2469 | 0.0363 | 0.4700 | 0.0718 | 0.4970 | 0.1210 | 0.2615 | 0.0688 |
| 50-3-4-3 | 0.4303 | 0.0389 | 0.5199 | 0.0950 | 0.4732 | 0.0334 | 0.2924 | 0.0346 |
| 50-3-4-5 | 0.4173 | 0.0824 | 0.5723 | 0.0441 | 0.4913 | 0.0622 | 0.2765 | 0.0236 |
| 50-3-6-3 | 0.3619 | 0.0620 | 0.5538 | 0.0643 | 0.5094 | 0.0323 | 0.2842 | 0.0616 |
| 50-3-6-5 | 0.3846 | 0.0403 | 0.5293 | 0.0467 | 0.5255 | 0.0560 | 0.2695 | 0.0425 |
| 50-5-3-3 | 0.4619 | 0.1154 | 0.5527 | 0.0906 | 0.4637 | 0.0493 | 0.2952 | 0.0788 |
| 50-5-3-5 | 0.3318 | 0.0547 | 0.5187 | 0.0880 | 0.5210 | 0.0688 | 0.2525 | 0.0376 |
| 50-5-4-3 | 0.4240 | 0.0690 | 0.4457 | 0.0536 | 0.5073 | 0.0859 | 0.3274 | 0.0410 |
| 50-5-4-5 | 0.3732 | 0.0719 | 0.4947 | 0.0674 | 0.4490 | 0.0942 | 0.2770 | 0.0648 |
| 50-5-6-3 | 0.3067 | 0.0369 | 0.6015 | 0.0375 | 0.4715 | 0.0607 | 0.2268 | 0.0380 |
| 50-5-6-5 | 0.3864 | 0.0490 | 0.4828 | 0.1024 | 0.4364 | 0.0534 | 0.2517 | 0.0558 |
| Hit rate | 7/36 | 7/36 | 0/36 | 2/36 | 0/36 | 6/36 | 29/36 | 21/36 |
Table 7. The lowest TEC (kWh) results.

| TEC (kWh) | NSGA-II | Jaya | MOEA/D | Q-MGCOA |
|---|---|---|---|---|
| 10-5-3-3 | 823 | 847 | 878 | 758 |
| 20-5-3-3 | 1348 | 1447 | 1596 | 1236 |
| 50-5-3-3 | 2908 | 3004 | 3321 | 2745 |
Disclaimer/Publisher’s Note: The statements, opinions and data contained in all publications are solely those of the individual author(s) and contributor(s) and not of MDPI and/or the editor(s). MDPI and/or the editor(s) disclaim responsibility for any injury to people or property resulting from any ideas, methods, instructions or products referred to in the content.
