Article

Energy-Efficient Scheduling for Distributed Hybrid Flowshop of Offshore Wind Blade Manufacturing Considering Limited Buffers

School of Logistics Engineering, Shanghai Maritime University, Shanghai 200135, China
*
Author to whom correspondence should be addressed.
J. Mar. Sci. Eng. 2025, 13(11), 2176; https://doi.org/10.3390/jmse13112176
Submission received: 21 October 2025 / Revised: 13 November 2025 / Accepted: 15 November 2025 / Published: 17 November 2025
(This article belongs to the Section Ocean Engineering)

Abstract

Amidst the backdrop of energy transition, scheduling problems in offshore manufacturing have emerged as critical challenges in marine engineering. However, the inherently coupled constraints of sequence-dependent setup times (SDST) and limited buffers (LB) have been largely overlooked. Therefore, this paper establishes the first multi-objective scheduling model, DHFSP-SDST&LB, specifically tailored for large components like turbine blades. A hybrid optimization algorithm, DDQN-MOCE, integrating an evolutionary algorithm (EA) and a double deep Q-network (DDQN), is proposed to overcome the inherent limitations of traditional MOEAs. In the EA component, a three-phase crossover and mutation policy is employed to generate offspring. In the DDQN component, the dimension-reduced feature vectors serve as the state input, and three makespan-oriented and two energy-oriented heuristic search actions are defined based on the knowledge. Finally, the optimal parameter combination is determined via Taguchi experimental design, and the effectiveness of DDQN-MOCE is evaluated on 36 instances and 1 industrial case. Experimental results demonstrate that DDQN-MOCE’s HV surpasses the second-best result by over 50% in 34 instances. It achieves the best GD, near-absolute dominance, and saves over 22% in total energy, with its high volume of solutions compensating for a minor weakness in spacing.

1. Introduction

In recent years, the global energy system has been transitioning toward a “low consumption, high output” trajectory. Offshore wind energy has become a crucial force in the green transition, owing to its efficiency and environmental friendliness [1]. As a vital component of the Ocean Engineering sector, the construction of offshore wind turbines represents a mainstream technological path for global collaborative decarbonization [2]. However, the manufacturing of offshore wind turbines, especially large-scale components, requires substantial input of resources and energy. Wind turbine blades are one of the core components, with the highest energy intensity and most complex manufacturing process, accounting for approximately 20% of the entire turbine’s cumulative energy demand [3]. Against the backdrop of the drive toward green and low-carbon manufacturing, how blade manufacturing can achieve the optimal balance between minimizing energy consumption and ensuring production capacity is a critical challenge that the current Ocean Engineering manufacturing sector urgently needs to address [4].
The manufacturing process of offshore wind turbine blades can be abstracted as a distributed hybrid flow shop scheduling problem (DHFSP). Blade manufacturing is a complex process sequentially involving stages such as layup, resin infusion, and mold closing. Some of these stages are equipped with multiple heterogeneous machines, which aligns with the hybrid flow shop (HFS) characteristics [5]. Concurrently, considering the blades’ dimensions and oversized transportation challenges, manufacturers generally adopt a multi-site distributed production model [6]. This multi-stage sequentiality coupled with multi-factory collaboration provides a strong academic basis for defining the problem as DHFSP. Solving DHFSP, which necessitates tasks of factory assignment, operation sequencing, and machine selection [7], typically aims to optimize key production metrics, such as the makespan [8], or multi-objective functions that incorporate energy consumption [9]. It combines the flexibility of distributed systems with the efficiency of flow shops, enabling energy reduction through methods like balancing machine workloads, and is commonly applied in fields such as marine transportation [10], hull structure manufacturing [11], and container logistics [12].
However, merely utilizing DHFSP to characterize the manufacturing process of offshore wind turbine blades is far from sufficient. In a real production environment, machines often need to perform adjustment operations when two adjacent jobs are processed on the same machine, such as switching resin systems, adjusting fiber layup angles, or resetting the curing temperature profile [13]. This time interval, during which processing is temporarily suspended and whose duration is dictated by the immediately adjacent job pair, is termed sequence-dependent setup time (SDST) [14]. Therefore, SDST must be incorporated into our model. Secondly, owing to limited space and storage costs, the temporary storage areas (buffers) cannot be expanded indefinitely [15]. Consequently, incorporating limited buffers (LBs) into our model is a more realistic choice. As early as 2008, researchers abstracted from a real television production environment a hybrid flow-shop scheduling problem that simultaneously considers SDST and LB [16]. Hakimzadeh, Sina, and Zandieh [17] extended this work, devising a scheduling scheme on printed-circuit-board (PCB) assembly lines. Recent literature, however, has mostly concentrated on the extreme case of no buffer space or has ignored the parallel machine environment and focused on single objectives. For example, Han et al. [18] devised a discrete evolutionary multi-objective optimization algorithm for the distributed blocking flow-shop scheduling problem with SDST, aiming to simultaneously minimize the makespan and total energy consumption. Zhao et al. [19] extended this problem by further introducing total tardiness as an additional objective. Zhao, Di, and Wang [20] integrated SDST, LB, and distributed shop scheduling, but they overlooked the parallel-machine environment. Focusing on the single-objective minimization of makespan, Zheng et al. [21] addressed the reentrant hybrid flowshop scheduling problem under the joint constraints of SDST and LB. 
However, the omission of complex constraints or simplification of the problem will only reduce the fidelity of the results to real-world scenarios and limit the scope of application. In particular, the critical constraint of zero buffer capacity means that the slack time of the scheduling system is entirely eliminated, as jobs remaining blocked force machines into mandatory idle time, which continuously consumes idle energy and compresses the energy saving potential [22]. The latest study directly related to our problem is presented by Müller et al. [23]. Under the same two constraints, they proposed a two-stage permutation flow-shop scheduling solution that optimized idle time and setup time. Therefore, inspired by the aforementioned studies, this paper abstracts the scheduling problem of offshore wind turbine blade manufacturing as a distributed hybrid flow shop scheduling problem constrained by sequence-dependent setup times and limited buffers (DHFSP-SDST&LB).
DHFSP has been proven to be NP-hard. The DHFSP-SDST&LB studied in this paper has higher complexity, which makes it harder to find the optimal solution in a large solution space. In recent years, multi-objective evolutionary algorithms (MOEAs), which offer powerful global search capability, have been applied to solve energy-aware manufacturing scheduling problems [24]. Among these, NSGA-II, MOEA/D, MOPSO, and SPEA2 are the most popular MOEAs in this domain [25]. However, the performance of these meta-heuristic algorithms largely depends on the encoding strategy and operator design [26]. When solving strongly coupled problems, they rely heavily on repair operators, which leads to premature convergence and makes them prone to becoming trapped in local optima [27]. In fact, pure MOEAs need to be “enhanced” [28], typically by combining them with machine learning (such as reinforcement learning) to boost the algorithm’s dynamic decision-making and adaptive capabilities [29,30]. Among various reinforcement learning algorithms, the Double Deep Q-Network (DDQN) decouples action selection from action evaluation, effectively mitigating the overestimation issue of traditional DQN, and has thereby attracted significant attention [31]. More importantly, when dealing with complex discrete scheduling actions, compared to policy gradient methods (such as PPO and A2C), which are better suited for continuous spaces, or other value-based algorithms, DDQN’s value function and learning mechanism can provide more stable and precise Q-value estimations, thus more directly guiding the scheduling agent to select the optimal energy-efficient strategy [32]. This is necessary to solve the DHFSP-SDST&LB. However, few studies are dedicated to developing hybrid algorithms that combine an evolutionary algorithm (EA) and DDQN to solve scheduling problems.
Therefore, we can only identify the feasibility of integrating the two through the following representative studies: Zhang et al. [33] adopted EA as the underlying search mechanism, with Q-learning guiding EA to select the most appropriate heuristic at each search step. In the multi-objective evolutionary algorithm designed by Chen et al. [34], EA was responsible for global and local search, and Q-learning dynamically determined task-splitting strategies. Unlike the above approaches that refine EA for search efficiency, Deng, Di, and Wang [35] dedicate EA to scheduling objectives, employing Q-learning to adjust the importance vector of makespan and energy consumption to steer population evolution. Concurrently, innovative algorithms resulting from the fusion of DDQN and other meta-heuristics have demonstrated higher stability and convergence efficiency in complex environments, and they are now widely applied in flow shop scheduling [36,37]. This body of work establishes the academic foundation for combining the two. It is also realized that to better resolve the coupling of constraints and the paradox of goals, the algorithm cannot merely focus on balancing intensiveness and diversification but should also consider how to organically combine problem characteristics with algorithm execution. Therefore, we design knowledge-based local-search operators and link the non-dominated solution set between EA and DDQN, thereby mitigating the NP-hard difficulty and providing a scalable scheduling paradigm for the marine renewable energy equipment industry.
Table 1 highlights the classic literature discussed in this review, distinctly positioning the focus and value of the present study. Overall, current mainstream research predominantly concentrates on non-marine domains, such as general manufacturing. Consequently, noticeable deficiencies exist in the modeling and optimization tailored for the specific large-component production environment of offshore wind blade manufacturing. Specifically, some models lack critical coupled constraints found in real production scenarios, others exhibit insufficient multi-objective integration, and still others fail to incorporate advanced hybrid algorithmic mechanisms to mitigate the common drawback of traditional MOEAs. For accurate characterization, this paper abstracts the DHFSP-SDST&LB mathematical model, featuring strong coupling and multi-objective characteristics, from real production scenarios. To overcome the drawbacks of MOEAs, a DDQN-driven Multi-objective Evolutionary Algorithm (DDQN-MOCE) is proposed. Herein, the EA is employed for the global search space. DDQN leverages a multi-heuristic neighborhood structure to dynamically select five knowledge-based operators, which are then used to perform local search within the non-dominated solution set generated by the EA. Finally, DDQN exchanges information with the EA and thereby generates an elite solution set that achieves an efficient balance between makespan and total energy consumption. The main contributions in this paper are listed below:
  • This paper extends the DHFSP problem by establishing a multi-objective mixed-integer linear programming (MILP) model that simultaneously considers the strong coupling constraints of SDST and LB.
  • A hybrid optimization algorithm combining evolutionary algorithm and reinforcement learning is proposed. It utilizes the dynamic decision-making capability of DDQN to overcome the drawbacks of MOEAs and adaptively selects five knowledge-driven search operators.
  • Two critical path-based energy-saving strategies are introduced; they reduce the machine’s idle time through right-shift and speed-scaling mechanisms, thereby cutting down the total energy consumption.
The remaining contents are organized as follows. Section 2 presents the mathematical model for DHFSP-SDST&LB. The detailed design of DDQN-MOCE is shown in Section 3. Section 4 provides the numerical experiments and analyses. Finally, Section 5 concludes this study and outlines future research directions.

2. Materials and Methods

2.1. Problem Description

The DHFSP-SDST&LB is described as follows: A set of jobs J = {J1, J2, …, Jj, …, Jn} can be processed in identical factories F = {F1, F2, …, Ff}, each of which is a flow shop. Processing requires sequential passage through stages S = {S1, S2, …, Ss}. A buffer with capacity bsf exists between each pair of consecutive stages and can temporarily hold at most bsf blade jobs. Each stage comprises mfs parallel machines of identical capability. Each machine operates at one of three speed levels: high, medium, or low. The actual processing time is negatively correlated with the speed level. For job j on machine i, the standard processing time is tijsf. If the machine operates at speed level l, the actual processing time is pijsfv = tijsf/vl. When job j leaves machine i and job j′ is to be processed next on machine i, a setup time stijj′sf occurs. SDSTs are separate from processing times and depend on the job sequence. The actual processing energy consumption is positively correlated with the speed level. For job j on machine i, the standard energy consumption per unit time is uifv. If the machine operates at speed level l, the energy consumption per unit time is τifv = uifv · vl. A framework diagram of the flow shop scheduling system studied in this paper is shown in Figure 1. When a job arrives at a stage, it is assigned to any idle and set-up machine at that stage for processing. Due to LB, the following situations may occur when jobs are processed:
  • If job j has already been processed at stage s − 1 and a machine m at stage s is free and prepared, job j is assigned to machine m and begins processing at stage s.
  • If job j has already been processed at stage s − 1 and no machine at stage s is free and prepared, but there are spaces in the buffer between stages s − 1 and s, then job j is allocated to the buffer in stage s − 1 and waits for one free and prepared machine m at stage s.
  • If job j has already been processed at stage s − 1, no machine at stage s is free and prepared, and there is no space in the buffer between stages s − 1 and s, then job j is blocked at the processing machine at stage s − 1 until one space becomes available in the buffer.
  • If job j is in situation 3 and the buffer is full, but a machine m at stage s becomes free and prepared, then job j is assigned to machine m and begins processing at stage s.
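The four situations above reduce to a three-way decision each time a job finishes stage s − 1. The following is a minimal sketch of that decision, not the authors' implementation; the function name `next_move` and its arguments are hypothetical.

```python
def next_move(machine_free: bool, buffer_used: int, buffer_capacity: int) -> str:
    """Decide where a job goes after finishing stage s-1 (illustrative only).

    machine_free    -- True if some machine at stage s is idle and set up
    buffer_used     -- jobs currently waiting in the buffer between s-1 and s
    buffer_capacity -- b_sf, the capacity of that buffer
    """
    if machine_free:
        # Situations 1 and 4: a prepared machine takes the job directly,
        # whether or not the buffer is full.
        return "process"
    if buffer_used < buffer_capacity:
        # Situation 2: the job waits in the buffer.
        return "buffer"
    # Situation 3: the job stays on, and blocks, its stage s-1 machine.
    return "block"
```

In the blocked case the upstream machine cannot start its next job, which is why the model charges idle energy to it.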
The assumptions of DHFSP-SDST&LB are indicated as follows. It is noteworthy that blade manufacturing adopts a fixed-position manufacturing mode, where internal transport distances are minimal. Furthermore, machine on/off operations and the setup energy consumption associated with SDST, such as the energy required for tool preheating, are typically far less than the subsequent hours of processing energy consumption, including the energy consumed during high-temperature curing. Therefore, in terms of magnitude, only processing and idle energy consumption are considered.
  • All jobs are available at time zero, and there is no precedence constraint for the jobs in any factory.
  • Machine breakdowns, order insertions, and other dynamic disturbances are not taken into consideration.
  • Job transportation times and machine speed changeover times are ignored.
  • Processing energy consumption is positively correlated with speed level; the higher the speed, the shorter the processing time.
  • Only processing and idle energy consumptions are considered; machine start/stop, setup, and job transport energy consumptions are ignored.
  • When a job is blocked on a machine, the machine incurs idle energy consumption.
  • Once a machine starts processing a job, interruption is not allowed.
  • After a job has been assigned to one factory and machine, it cannot be transferred.
  • Each job cannot be processed on more than one machine simultaneously, and each machine can process at most one job at a time.

2.2. MILP Model

The notations and constraints of DHFSP-SDST&LB are described in Table 2.
The MILP model for DHFSP-SDST&LB is presented as follows:
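$$\text{Minimize}\quad c_{\max} = \max_{j \in J,\, s \in S,\, f \in F,\, i \in M_{fs}} c_{ijsf} \tag{1}$$
$$\text{Minimize}\quad tec = \sum_{j \in J,\, s \in S,\, f \in F,\, v \in L,\, i \in M_{fs}} \tau_{ifv}\, p_{ijsfv}\, x_{ijsfv} + \sum_{s \in S,\, f \in F,\, i \in M_{fs}} \varphi_{ifs} \Big( c_{\max} - \sum_{j \in J,\, v \in L} p_{ijsfv}\, x_{ijsfv} \Big) \tag{2}$$
Subject to:
$$\sum_{f \in F} y_{jf} = 1, \quad \forall j \in J \tag{3}$$
$$\sum_{v \in L,\, i \in M_{fs}} x_{ijsfv} = y_{jf}, \quad \forall j \in J,\ s \in S,\ f \in F \tag{4}$$
$$\sum_{j \in J,\, v \in L} x_{ijsfv} \le 1, \quad \forall i \in M_{fs},\ s \in S,\ f \in F \tag{5}$$
$$c_{ijsf} \ge s_{ijsf} + p_{ijsfv}\, x_{ijsfv} - M (1 - x_{ijsfv}), \quad \forall j \in J,\ s \in S,\ f \in F,\ v \in L,\ i \in M_{fs} \tag{6}$$
$$s_{ij'sf} \ge c_{ijsf} + st_{ijj'sf} - M (1 - z_{ijj'sf}), \quad \forall j, j' \in J,\ j \ne j',\ s \in S,\ f \in F,\ i \in M_{fs} \tag{7}$$
$$\sum_{j \in J} w_{jsf} \le b_{sf}, \quad \forall s \in S,\ f \in F \tag{8}$$
$$c_{ij(s-1)f} \le s_{ijsf} + M (1 - x_{ijsfv}), \quad \forall j \in J,\ s > 1,\ f \in F,\ i \in M_{fs} \tag{9}$$
$$w_{jsf} \le \sum_{i \in M_{fs}} x_{ijsfv}, \quad \forall j \in J,\ s \in S,\ f \in F \tag{10}$$
$$s_{ij'sf} \ge c_{ijsf} + st_{ijj'sf} - M (2 - x_{ijsfv} - x_{ij'sfv}), \quad \forall j, j' \in J,\ j \ne j',\ s \in S,\ f \in F,\ i \in M_{fs} \tag{11}$$
$$c_{\max} \ge c_{ijsf}, \quad \forall j \in J,\ s \in S,\ f \in F,\ i \in M_{fs} \tag{12}$$
$$z_{ijj'sf} \le x_{ijsfv}\, x_{ij'sfv}, \quad \forall j, j' \in J,\ j \ne j',\ s \in S,\ f \in F,\ i \in M_{fs} \tag{13}$$
$$y_{jf},\ x_{ijsfv},\ z_{ijj'sf},\ w_{jsf} \in \{0, 1\}, \quad \forall j \in J,\ s \in S,\ f \in F,\ v \in L,\ i \in M_{fs} \tag{14}$$
$$c_{\max},\ tec,\ c_{ijsf},\ s_{ijsf} \ge 0, \quad \forall j \in J,\ s \in S,\ f \in F,\ i \in M_{fs} \tag{15}$$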
Objective Functions (1) and (2) indicate that the objectives are to minimize the maximum completion time and the total energy consumption, the latter comprising processing energy consumption and idle energy consumption. Constraint (3) ensures that each job is assigned to exactly one factory. Constraint (4) guarantees that, at each stage, each job is assigned to only one machine and is processed at one speed. Constraint (5) restricts a machine to processing at most one job at a time. Constraint (6) describes the relationship between the start time sijsf and the completion time cijsf. Constraint (7) ensures that the setup time elapses between adjacent jobs on the same machine. Constraint (8) is a buffer capacity constraint: the total number of jobs waiting in the buffer between stages s and s+1 does not exceed its capacity limit. Constraint (9) ensures that a job can begin the current stage of processing or enter the buffer only after completing the previous stage of processing. Constraint (10) guarantees that job j is allowed to use the buffer of that stage only if it is assigned to a machine in stage s. Constraint (11) prevents overlapping processing times for jobs on the same machine while accounting for sequence-dependent setup times. Constraint (12) defines the maximum completion time. Constraints (13)–(15) limit the variables.
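To make the two objectives concrete, the sketch below evaluates them for a small, already-timed schedule. It is an illustration under simplifying assumptions, not the authors' code: setup times are taken to be already reflected in the start times, and the dictionary keys (`machine`, `start`, `t`, `u`, `v`) are hypothetical names for the machine, start time, standard processing time, standard unit energy, and speed factor.

```python
from collections import defaultdict

def objectives(ops, idle_power):
    """Return (c_max, tec) for a list of timed operations.

    ops        -- list of dicts with keys machine, start, t, u, v (assumed names)
    idle_power -- dict machine -> phi, idle energy per unit time
    """
    c_max = 0.0
    processing_energy = 0.0
    busy = defaultdict(float)          # total processing time per machine
    for op in ops:
        p = op["t"] / op["v"]          # actual processing time p = t / v
        tau = op["u"] * op["v"]        # energy per unit time tau = u * v
        c_max = max(c_max, op["start"] + p)
        processing_energy += tau * p
        busy[op["machine"]] += p
    # Idle energy: each machine is charged phi for the time it is not processing.
    idle_energy = sum(phi * (c_max - busy[m]) for m, phi in idle_power.items())
    return c_max, processing_energy + idle_energy
```

For example, two machines that process for 10 and 5 time units within a makespan of 15 are idle for 5 and 10 units, respectively, and that idle time is what the second term of Objective Function (2) penalizes.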

3. Methodology

In order to solve DHFSP-SDST&LB of blade manufacturing, Section 3 presents the proposed DDQN-MOCE and details its components.

3.1. Framework of DDQN-MOCE

As shown in Figure 2, the algorithm mainly includes encoding, initialization, global search, decoding and updating solution set, local search driven by co-evolution mechanism, and agent training. Firstly, a three-dimensional encoding scheme for jobs, factories, and speeds is devised, and an initial population is generated at random. With the help of the EA’s global search and its crossover and mutation mechanisms, a Pareto set is maintained and updated. Based on the critical path and critical operations, the local-search operators encompass fundamental perturbation operators related to jobs, factories, and speeds for minimizing the makespan, as well as critical-path-guided operators specifically aimed at reducing tec. During the local search phase, the state is expressed as a one-dimensional vector after dimensionality reduction. DDQN, based on its learned Q-policy, dynamically selects the aforementioned heuristic operators and calculates the reward. The agent is then trained based on historical experience.

3.2. Encoding and Decoding

For the encoding scheme, a three-level encoding method is employed, which generates the job sequence vector (JS), the factory assignment vector (FS), and the speed selection vector (SS). The length of the JS and FS vectors is n, whereas the length of SS is the product of the number of jobs and the number of stages. This design ensures that the encoded solution itself is inherently feasible. A job can only be assigned to a factory after confirming that both parallel machines and buffers at all processing stages of the current factory are available. Therefore, the feasibility of the solution is embedded into the chromosome via the LB constraint during the encoding stage, thus preventing constraint conflicts during the decoding process.
For the decoding scheme, in order to convert the encoded solution into a feasible schedule, the job subsets assigned to each factory are first identified by aligning JS and FS. At the first processing stage, a randomly selected machine processes the job at the speed dictated by SS. In all subsequent stages, machines are successively seized by jobs following JS, with speeds set by SS. Finally, cmax and tec are calculated using Objective Functions (1) and (2).
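The three vectors and the first decoding step (splitting JS by factory) can be sketched as follows. This is a minimal illustration that assumes FS is indexed by job id and that speed levels are coded as integers 0, 1, 2; the function names are hypothetical, and the buffer-availability check performed during encoding is omitted.

```python
import random

def random_solution(n_jobs, n_factories, n_stages, n_speeds=3, rng=None):
    """Draw a random (JS, FS, SS) triple with the lengths stated in the text."""
    rng = rng or random.Random()
    js = list(range(n_jobs))
    rng.shuffle(js)                                          # job sequence, length n
    fs = [rng.randrange(n_factories) for _ in range(n_jobs)] # factory of each job
    ss = [rng.randrange(n_speeds) for _ in range(n_jobs * n_stages)]  # length n * s
    return js, fs, ss

def jobs_per_factory(js, fs, n_factories):
    """Split JS into per-factory sequences, preserving the JS order."""
    groups = [[] for _ in range(n_factories)]
    for job in js:
        groups[fs[job]].append(job)   # assumption: fs[job] is the factory of `job`
    return groups
```

Each per-factory sequence would then be passed to a timing routine that assigns machines stage by stage and applies the setup and buffer rules.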

3.3. Global Searching

After randomly generating an initial population that matches the population size, DDQN-MOCE applies crossover and mutation operations to the JS, FS, and SS vectors of the parents to generate a new population. The crossover steps are as follows: (1) Two crossover points, cp1 and cp2, are randomly generated. (2) For JS, the gene sequence from the start to cp1 in parent1 is selected, and the gene sequence from cp1+1 to the end in parent2 is selected. These two sequences are combined to generate the offspring. (3) For FS, the gene sequence from the start to cp2 in parent1 is selected, and the gene sequence from cp2+1 to the end in parent2 is selected; these two sequences are combined to generate the offspring. (4) For SS, values are alternately selected to generate the offspring; that is, the speed from parent1 is chosen for even positions, and the speed from parent2 is chosen for odd positions.
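A rough sketch of the three-part crossover is given below. One detail is an assumption rather than something stated in the text: a plain one-point cut on JS can duplicate jobs, so this sketch completes the parent1 prefix with the missing jobs in parent2 order, which keeps JS a valid permutation. Positions are counted from zero, and the function name is hypothetical.

```python
import random

def crossover(p1, p2, rng=None):
    """p1, p2: (js, fs, ss) tuples of equal shape. Returns one offspring tuple."""
    rng = rng or random.Random()
    js1, fs1, ss1 = p1
    js2, fs2, ss2 = p2
    n = len(js1)
    cp1 = rng.randrange(n)
    cp2 = rng.randrange(n)
    # JS: prefix of parent1 up to cp1, completed with the jobs not yet used,
    # taken in parent2 order (assumed variant that avoids duplicate jobs).
    head = js1[:cp1 + 1]
    seen = set(head)
    js = head + [j for j in js2 if j not in seen]
    # FS: one-point crossover at cp2.
    fs = fs1[:cp2 + 1] + fs2[cp2 + 1:]
    # SS: even positions from parent1, odd positions from parent2.
    ss = [a if k % 2 == 0 else b for k, (a, b) in enumerate(zip(ss1, ss2))]
    return js, fs, ss
```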
To enhance population diversity, a mutation strategy is employed, with the steps as follows: (1) For JS, two positions are randomly selected, and their order is swapped. (2) For FS, one job is randomly selected, and if it can be assigned to other factories, a factory other than the original one is randomly chosen. (3) For SS, one job and one stage are randomly selected, and the speed is mutated to another available speed other than the original one.
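The three mutation steps can be sketched as below. This is an illustrative version that assumes integer-coded factories and speed levels and treats every other factory as eligible; the function name is hypothetical, and it requires at least two jobs.

```python
import random

def mutate(sol, n_factories, n_speeds=3, rng=None):
    """Return a mutated copy of sol = (js, fs, ss); the input is left unchanged."""
    rng = rng or random.Random()
    js, fs, ss = (list(part) for part in sol)
    # JS: swap two distinct random positions.
    a, b = rng.sample(range(len(js)), 2)
    js[a], js[b] = js[b], js[a]
    # FS: move one random job to a different factory, if one exists.
    if n_factories > 1:
        j = rng.randrange(len(fs))
        fs[j] = rng.choice([f for f in range(n_factories) if f != fs[j]])
    # SS: change one (job, stage) entry to a different speed level.
    if n_speeds > 1:
        k = rng.randrange(len(ss))
        ss[k] = rng.choice([v for v in range(n_speeds) if v != ss[k]])
    return js, fs, ss
```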
The resultant offspring, in conjunction with their parent solutions, partake in the solution set update process. Finally, through non-dominance assessment, incremental update mechanism, and crowding distance-based pruning, the non-dominated solution set is maintained, thereby preparing for the subsequent local search.
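The solution set update described here combines a dominance filter with crowding-distance pruning. The sketch below shows one common way to implement both for minimization objectives such as (cmax, tec); it follows the standard NSGA-II style crowding distance and is not necessarily identical to the authors' incremental update. All function names are hypothetical.

```python
def dominates(a, b):
    """True if objective vector a dominates b (all objectives minimized)."""
    return all(x <= y for x, y in zip(a, b)) and any(x < y for x, y in zip(a, b))

def non_dominated(points):
    """Keep the points that no other point dominates."""
    return [p for p in points if not any(dominates(q, p) for q in points)]

def crowding_distance(front):
    """Crowding distance of each point; boundary points get infinity."""
    n = len(front)
    dist = [0.0] * n
    if n == 0:
        return dist
    for m in range(len(front[0])):
        order = sorted(range(n), key=lambda i: front[i][m])
        lo, hi = front[order[0]][m], front[order[-1]][m]
        dist[order[0]] = dist[order[-1]] = float("inf")
        if hi == lo:
            continue                      # all values equal on this objective
        for r in range(1, n - 1):
            i = order[r]
            dist[i] += (front[order[r + 1]][m] - front[order[r - 1]][m]) / (hi - lo)
    return dist

def prune(front, max_size):
    """Keep the max_size least crowded points, in their original order."""
    d = crowding_distance(front)
    keep = sorted(range(len(front)), key=lambda i: d[i], reverse=True)[:max_size]
    return [front[i] for i in sorted(keep)]
```

Pruning by crowding distance removes points from dense regions first, so the extremes of the front (best makespan, best energy) are always retained.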

3.4. Energy-Efficient Strategy

The model in this paper takes into account processing energy consumption and idle energy consumption. Hence, the energy-efficient strategy is designed around speed adjustment and idle time reduction.
Drawing on the prior knowledge presented in [38], the critical path (Pc) and critical operation (Oc) are defined. For DHFSP-SDST&LB, the critical path is the longest continuous job path of a solution with no idle time. Each operation on the critical path is termed a critical operation (Oc). As shown in Figure 3, the critical path appears in factory0, where a decrease in the processing speed of any job on this path results in an increase in the makespan and a reduction in energy consumption. Therefore, while keeping Oc completely unchanged, delaying the start processing time of noncritical operations Of as much as possible or reducing the processing speed of Of as much as possible can lead to lower energy consumption. Based on this, two strategies are designed, namely EM1 and EM2. Algorithms 1 and 2 present the corresponding pseudocodes.
  • EM1: Identify Pc and Oc, collect all Of, calculate the maximum allowable delay time for Of, and delay the start processing time of Of as late as possible without increasing cmax.
  • EM2: Identify Pc and Oc, collect all Of, obtain all available speed options lower than the current speed of the machine processing Of, and select the slowest available speed for the machine without increasing cmax.
Algorithm 1: Energy-Efficient Strategy EM1
Input:
current solution Sol
decoded schedule schedule
makespan cmax
Output: updated solution Solnew
1: Pc ← IdentifyCriticalPath(schedule, cmax) // Identify the critical path
2: Oc ← [operations in Pc] // Collect all critical operations from the critical path
3: Of ← [operations in schedule but not in Oc] // Collect all non-critical operations
4: Sort Of by finish time in descending order
5: for each operation o in Of do
6:  tstart ← o.Start
7:  tfinish ← o.Finish
8:  tproc ← tfinish − tstart
9:  delaymax ← tproc
10:  tnew_start ← tstart + delaymax
11:  tnew_finish ← tnew_start + tproc
12:  if tnew_finish ≤ cmax then
13:   o.Start ← tnew_start
14:   o.Finish ← tnew_finish
15:  end if
16: end for
17: return Solnew
Algorithm 2: Energy-Efficient Strategy EM2
Input:
current solution Sol
decoded schedule schedule
makespan cmax
set of available speed levels V
Output: updated solution Solnew
1: Pc ← IdentifyCriticalPath(schedule, cmax) // Identify the critical path
2: Oc ← [operations in Pc] // Collect all critical operations from the critical path
3: Of ← [operations in schedule but not in Oc] // Collect all non-critical operations
4: for each operation o in Of do
5:  j ← o.Job
6:  s ← o.Stage
7:  vcurrent ← Sol.speed_selection[j][s]
8:  Vslower ← {v ∈ V | v < vcurrent}
9:  if Vslower ≠ ∅ then
10:   vnew ← max(Vslower)
11:   tproc_new ← GetProcessingTime(o.Factory, o.Stage, o.Machine, j, vnew) // Calculate new processing time
12:   if o.Start + tproc_new ≤ cmax then
13:    Sol.speed_selection[j][s] ← vnew // Update speed in the solution
14:   end if
15:  end if
16: end for
17: return Solnew
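Algorithms 1 and 2 can be transcribed almost line by line. The sketch below does so under stated assumptions: operations are stored in a dictionary keyed by (job, stage), the critical set is supplied by the caller rather than computed, speed levels are numeric factors, and the processing time at speed v is t / v. Like the pseudocode, it checks only against cmax and does not verify successor or buffer constraints. The function names and dictionary keys are hypothetical.

```python
def em1_right_shift(ops, critical, c_max):
    """Algorithm 1 sketch: return {key: (new_start, new_finish)} for shifted ops."""
    shifts = {}
    # Visit non-critical operations by finish time, latest first.
    for key in sorted(ops, key=lambda k: ops[k]["finish"], reverse=True):
        if key in critical:
            continue
        op = ops[key]
        t_proc = op["finish"] - op["start"]
        new_start = op["start"] + t_proc       # delay_max = t_proc, as in Algorithm 1
        new_finish = new_start + t_proc
        if new_finish <= c_max:
            shifts[key] = (new_start, new_finish)
    return shifts

def em2_slow_down(ops, critical, speed_levels, c_max):
    """Algorithm 2 sketch: return {key: new_speed} for ops that can be slowed."""
    changed = {}
    for key, op in ops.items():
        if key in critical:
            continue
        slower = [v for v in speed_levels if v < op["v"]]
        if not slower:
            continue
        v_new = max(slower)                    # next lower level, as in line 10
        if op["start"] + op["t"] / v_new <= c_max:
            changed[key] = v_new
    return changed
```

Both functions return the proposed changes instead of modifying the schedule, so a caller can re-decode the solution and confirm that cmax is unchanged before accepting them.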

3.5. Local-Search Operators

The local-search operators developed in this paper focus on Oc, cmax, and tec, and they comprise five operators based on problem features. Among them, LS1–LS3 are basic perturbation operators that prioritize cmax reduction, and ES1 and ES2 are critical-path-guided operators that apply the strategies EM1 and EM2 detailed in Section 3.4 and are dedicated to lowering tec.
  • LS1: Randomly choose a job sequence and two positions within it. Swap the corresponding jobs to obtain a new processing order.
  • LS2: Randomly choose a machine. Exclude its current speed and then randomly assign one of the remaining available speed levels.
  • LS3: Randomly choose a job. Exclude its current factory and then randomly assign one of the remaining eligible factories.
  • ES1: Execute EM1 to delay all Of as late as possible.
  • ES2: Execute EM2 to set the slowest feasible speed for all Of.

3.6. DDQN-Based Local Search

The double deep Q-network plays a crucial role in dynamically selecting the most suitable operator. Once the EA completes the global search and passes the non-dominated solutions to the agent, DDQN uses its learned Q-policy to evaluate the current scheduling state and invoke the operator with the highest estimated value. The model based on DDQN is defined as follows:
  • Neural network module: We design a four-layer fully connected structure. The input layer receives feature vectors from the Pareto solutions, then it extracts higher-order features step by step through three hidden layers (256 → 128 → 64), each followed by a ReLU activation function to enhance non-linear expression ability. Finally, the output layer generates a five-dimensional Q-value vector corresponding to the expected cumulative rewards of the five search operators. The neural network uses forward propagation for accurate mapping from factory states to operator values.
  • State space: The state space is composed of three dimensions, namely job sequence, factory assignment, and speed selection. For the job sequence dimension, the JS vector is normalized and mapped into the range [0, 1]. For the factory assignment dimension, the FS vector is transformed into a one-hot encoding. For the speed selection dimension, the SS matrix is first flattened by reshaping it into a one-dimensional vector row-wise and then globally normalized.
  • Action space: The discrete action space is defined as action ∈ {0, 1, 2, 3, 4}, corresponding to preset heuristic operators LS1–ES2. An ε-greedy policy is adopted. If the random number is below ε, an operator is chosen at random. Otherwise, the state vector is tensorized and forwarded to the evaluation network, and the operator yielding the maximum Q-value is selected.
  • Reward function: To reconcile the dual objectives of minimizing cmax and tec, a hierarchical reward function with embedded priorities is devised. When both cmax and tec are improved, the reward is 20; when only cmax is improved, the reward is 15; when only tec is improved, the reward is 10; and when neither is improved, the reward is 0.
  • Network training: When the number of stored samples is sufficient, the neural network begins training. Every NU training steps, the parameters θ2 of the target network QT are overwritten with the parameters θ1 of the evaluation network QE. Subsequently, bs experience samples are randomly drawn from the replay pool, from which the state St, action At, reward Rt, and next state St+1 are extracted. QE predicts the Q-value of the current state–action pair (St, At) and all Q-values of St+1, from which the action At+1 with the largest Q-value is selected. QT then provides the Q-value corresponding to At+1, which is used to form the target Q-value. Finally, based on the learning rate α, the parameters of QE are updated using the Adam optimizer.
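The state, action, and reward definitions above can be illustrated without any deep learning library. The sketch below is an assumption-laden simplification: it uses min-max scaling for the normalization steps, integer-coded factories and speeds, and omits the dimensionality reduction mentioned in the framework. The function names are hypothetical.

```python
import random

# Hierarchical reward: (c_max improved, tec improved) -> reward value from the text.
REWARDS = {(True, True): 20, (True, False): 15, (False, True): 10, (False, False): 0}

def reward(old, new):
    """old, new: (c_max, tec) before and after applying an operator."""
    return REWARDS[(new[0] < old[0], new[1] < old[1])]

def select_action(q_values, epsilon, rng=None):
    """Epsilon-greedy choice among the five operators (indices 0..4)."""
    rng = rng or random.Random()
    if rng.random() < epsilon:
        return rng.randrange(len(q_values))                 # explore
    return max(range(len(q_values)), key=lambda a: q_values[a])  # exploit

def encode_state(js, fs, ss, n_factories, n_speeds=3):
    """Concatenate normalized JS, one-hot FS, and normalized flattened SS."""
    n = len(js)
    js_part = [j / max(n - 1, 1) for j in js]                # job ids scaled to [0, 1]
    fs_part = [1.0 if f == k else 0.0 for f in fs for k in range(n_factories)]
    ss_part = [v / max(n_speeds - 1, 1) for v in ss]         # speed codes scaled to [0, 1]
    return js_part + fs_part + ss_part
```

The reward ordering (20 > 15 > 10 > 0) encodes the priority of makespan over energy when only one objective improves.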
The training of the DDQN agent proceeds simultaneously with the iteration of EA, operating as an online training mechanism. Therefore, we evaluate the convergence of DDQN-MOCE by observing the stability of the average reward and Pareto performance metrics over consecutive generations. Preliminary experiments showed that when the training reached approximately 150 generations, the average reward and metrics such as HV, spacing, and GD had essentially entered a statistical steady state, with no significant growth observed in subsequent iterations. Consequently, we set 200 iterations as the final convergence criterion. The pseudocode is shown in Algorithm 3.
Algorithm 3: Training process of DDQN
Input:
QE(θ1): evaluated Q-network with parameters θ1
QT(θ2): target Q-network with parameters θ2
E: replay buffer (deque)
batch_size: number of samples per batch (bs)
learning_rate: optimizer learning rate (α)
discount_factor: reward discount γ
update_target_every: steps interval to update target network (NU)
current_step: total training step counter
Epoch: number of optimizations passes
Output: Updated QE(θ1) and QT(θ2)
1: if len(E) < bs then
2:    return // not enough samples to train
3: if current_step mod NU == 0 then
4:  θ2 ← θ1 // update target network
5: for epoch in range (Epoch) do
6:  T ← random sample of size bs from E
7:  Extract (St, At, Rt, St+1) from T // all values are tensors
8:  q_eval ← QE(St)[At] // gather Q-value of selected action
9:  q_next_eval ← QE(St+1) // all Q-values for next state from QE
10:  At+1 ← argmax(q_next_eval) // select best next action
11:  q_next_target ← QT(St+1)[At+1] // get target Q-value from QT
12:  q_target ← Rt + γ * q_next_target // compute target Q-value
13:  L ← MSE(q_eval, q_target) // mean squared error loss
14:  Adam (QE(θ1), α, L). // Update QE(θ1) using Adam optimizer with loss L and learning rate α
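The target computation at the core of Algorithm 3 can be illustrated without a deep-learning framework. The sketch below is an assumption-laden simplification: plain Python lists replace tensors, the function names are hypothetical, and the gradient step with Adam is omitted. It shows only how the evaluation network selects the next action while the target network scores it.

```python
def double_dqn_targets(rewards, q_next_eval, q_next_target, gamma):
    """Return the Double DQN target for each sampled transition.

    rewards:       list of R_t
    q_next_eval:   list of rows, QE(S_{t+1}) for every action
    q_next_target: list of rows, QT(S_{t+1}) for every action
    """
    targets = []
    for r, qe_row, qt_row in zip(rewards, q_next_eval, q_next_target):
        # QE selects the next action (argmax over its own estimates) ...
        a_next = max(range(len(qe_row)), key=lambda a: qe_row[a])
        # ... and QT evaluates that action, decoupling selection from evaluation.
        targets.append(r + gamma * qt_row[a_next])
    return targets


def mse(q_eval, q_target):
    """Mean squared error between predicted and target Q-values."""
    return sum((p - t) ** 2 for p, t in zip(q_eval, q_target)) / len(q_eval)


# Two made-up transitions with three actions each, for illustration only.
example_targets = double_dqn_targets(
    rewards=[20, 0],
    q_next_eval=[[1.0, 3.0, 2.0], [0.5, 0.2, 0.1]],
    q_next_target=[[10.0, 4.0, 8.0], [2.0, 9.0, 9.0]],
    gamma=0.85,
)
```

In the first made-up transition, QE prefers action 1, so the target uses QT's value 4.0 for that action rather than QT's own maximum of 10.0. A standard DQN would take the maximum, and avoiding it is how the double estimator reduces overestimation.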

4. Results and Discussion

This section presents the experimental evaluation of DDQN-MOCE, including parameter analysis, an ablation study of its components, and comparison experiments. All experiments are implemented in Python 3.11 with PyTorch 2.1.0 as the core computational engine.

4.1. Instances and Metrics

As DHFSP-SDST&LB is a newly introduced problem for which no open benchmark is currently available, the data ranges were determined from a practical investigation of wind turbine blade manufacturing enterprises, leading to the selection of 36 simulation instances of varying scales. We set the number of jobs n ∈ {20, 50, 100, 200}, the number of factories f ∈ {3, 4, 5}, the number of stages s ∈ {2, 5, 8}, and the number of parallel machines mfs ∈ {1, 2, 3}. Standard processing times tijsf follow a continuous uniform distribution on [1, 50], setup times are uniformly distributed integers in [1, 5], and buffer capacities are randomly selected from {1, 2, 3, 4, 5}. The standard processing energy consumptions per unit time uifv are uniformly distributed in [1.0, 6.0], and the idle energy consumptions per unit time φifs are uniformly distributed in [0.5, 2.5]. These settings are all abstracted from real-world wind turbine blade production, and the stages and machines can be merged or subdivided as appropriate for specific blade types. Across all combinations of job count, factory count, and stage count, 4 × 3 × 3 = 36 instances are generated, with mfs set randomly by the code.
Hypervolume (HV), spacing, generational distance (GD), and coverage (C-metric) are used to evaluate each algorithm. A larger HV indicates better comprehensive performance. A lower spacing indicates a more uniformly distributed solution set, and a lower GD indicates better convergence. The C-metric C(A, B) measures the proportion of solutions in set B that are dominated by at least one solution in set A.
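For readers who want to reproduce the four metrics, the following sketch gives one common formulation of each for two minimized objectives (cmax, tec). It is not taken from the paper: the function names are hypothetical, GD is computed as the mean nearest-neighbor Euclidean distance to a reference front, spacing follows Schott's definition with Manhattan distances, and HV is the area bounded by a user-supplied reference point. The paper's normalization and reference point are not specified here and are therefore left to the caller.

```python
import math


def dominates(a, b):
    """True if point a Pareto-dominates point b (all objectives minimized)."""
    return all(x <= y for x, y in zip(a, b)) and any(x < y for x, y in zip(a, b))


def c_metric(A, B):
    """C(A, B): fraction of points in B dominated by at least one point in A."""
    return sum(any(dominates(a, b) for a in A) for b in B) / len(B)


def generational_distance(front, reference):
    """Mean Euclidean distance from each front point to its nearest reference point."""
    return sum(min(math.dist(p, r) for r in reference) for p in front) / len(front)


def spacing(front):
    """Schott's spacing: standard deviation of nearest-neighbor Manhattan distances."""
    d = [
        min(sum(abs(x - y) for x, y in zip(p, q))
            for j, q in enumerate(front) if j != i)
        for i, p in enumerate(front)
    ]
    mean_d = sum(d) / len(d)
    return math.sqrt(sum((mean_d - di) ** 2 for di in d) / (len(d) - 1))


def hypervolume_2d(front, ref):
    """Area dominated by the front and bounded by ref (two minimized objectives)."""
    pts = sorted(p for p in front if p[0] < ref[0] and p[1] < ref[1])
    area, best_f2 = 0.0, ref[1]
    for f1, f2 in pts:
        if f2 < best_f2:  # dominated points add no area and are skipped
            area += (ref[0] - f1) * (best_f2 - f2)
            best_f2 = f2
    return area
```

Note that C(A, B) and C(B, A) are not complementary, which is why both directions are reported when two algorithms are compared.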

4.2. Parameter Settings

DDQN-MOCE includes six parameters: population size Ps, mutation rate Pm, crossover rate Pc, learning rate α, discount factor γ, and exploration rate ε. The Taguchi design-of-experiments (DOE) method [39] is adopted to determine the optimal parameter settings. Each parameter is assigned three levels, and an orthogonal array L18(3^6) is designed. The parameter levels are as follows: Ps = {30, 80, 100}, Pm = {0.1, 0.2, 0.3}, Pc = {0.8, 0.9, 1.0}, α = {0.001, 0.01, 0.1}, γ = {0.85, 0.9, 0.95}, ε = {0.8, 0.85, 0.9}.
The experiment is conducted on the medium-scale instance 4–4–100 (4 factories, 4 stages, 100 jobs), with all other settings identical to Section 4.1. Every parameter combination is executed independently 20 times, with each run limited to 250 iterations, and HV is employed as the evaluation criterion. Figure 4 shows the main effects plot of all parameters, from which the optimal configuration is Ps = 30, Pm = 0.3, Pc = 1.0, α = 0.01, γ = 0.85, ε = 0.85. In addition to the core hyperparameters determined through the Taguchi experiment, DDQN requires several auxiliary parameters to ensure stable training. Following common practice in deep learning for scheduling and a preliminary sensitivity analysis, we adopted the following settings: learning rate (α) = 0.01; replay-memory size = 10,000; batch size = 64; discount factor (γ) = 0.85; exploration rate (ε) = 0.85; target-update interval = 100.
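The main-effects analysis behind a plot such as Figure 4 reduces to averaging the response over all runs that share a factor level. The sketch below illustrates this under stated assumptions: the function names are hypothetical, and for brevity it uses the standard L9(3^4) orthogonal array (four three-level factors) rather than the L18 array used in the paper. No experimental values from the paper are reproduced.

```python
from collections import defaultdict

# Standard L9(3^4) orthogonal array with 0-based levels (a smaller stand-in
# for the paper's L18 design; each level appears three times per column).
L9 = [
    (0, 0, 0, 0), (0, 1, 1, 1), (0, 2, 2, 2),
    (1, 0, 1, 2), (1, 1, 2, 0), (1, 2, 0, 1),
    (2, 0, 2, 1), (2, 1, 0, 2), (2, 2, 1, 0),
]


def main_effects(design, responses):
    """Mean response for every (factor, level) pair in the design."""
    sums = defaultdict(float)
    counts = defaultdict(int)
    for row, y in zip(design, responses):
        for factor, level in enumerate(row):
            sums[(factor, level)] += y
            counts[(factor, level)] += 1
    return {key: sums[key] / counts[key] for key in sums}


def best_levels(effects, n_factors):
    """Level with the highest mean response per factor (HV is maximized)."""
    return [
        max((lvl for (f, lvl) in effects if f == factor),
            key=lambda lvl: effects[(factor, lvl)])
        for factor in range(n_factors)
    ]
```

In use, `responses` would hold the mean HV of the 20 runs for each row of the design, and `best_levels(main_effects(design, responses), n_factors)` would return the level index to adopt for each parameter.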

4.3. Effectiveness of the Algorithm Components

To evaluate the contributions of DDQN and the five local-search operators within DDQN-MOCE, this section introduces two variants, D1 and D2. D1 is DDQN-MOCE without DDQN dynamically selecting the search operators. In D2, DDQN can only select LS1, LS2, and LS3 during local search. D1, D2, and DDQN-MOCE adopt identical termination criteria. Each of the 36 instances from Section 4.1 is executed in 20 independent runs, each capped at 200 iterations. Table 3 lists each algorithm’s statistical results for HV, spacing, and GD, and the average C-metric values are listed in Table 4. The problem scale is labeled in the form “factory–stage–job”. Bold values with a gray background highlight the superior performance.
Table 3 shows that DDQN-MOCE performs best across most instances, achieving the highest mean HV (32/36), the lowest mean spacing (22/36), and the lowest mean GD (34/36). Compared to DDQN-MOCE, D1 records a significantly lower HV and shows no advantage in GD. Although D1 yields slightly lower spacing than DDQN-MOCE in five instances, its overall solution quality remains poor. This confirms that, without DDQN, the algorithm cannot target high-yield operators according to the convergence state of the current solution set. Compared to D1, D2 achieves higher HV and lower GD across all instances, validating the effectiveness of the three basic perturbation operators focused on cmax optimization. Compared to D2, DDQN-MOCE achieves better HV and GD in most cases. This is because D2 lacks ES1 and ES2, which prevents its solutions from fully aligning with the objective of tec minimization. Although D2 holds an advantage in spacing, mainly in larger-scale instances, the energy-efficient strategy remains effective from a global perspective.
As shown in Table 4, DDQN-MOCE outperforms both D1 and D2 in 31 instances. Specifically, C(DDQN-MOCE, D1) is significantly higher than C(D1, DDQN-MOCE); in other words, DDQN-MOCE almost completely dominates D1. DDQN-MOCE is surpassed by D2 in five large-scale instances, but overall it remains the best and most robust algorithm. Therefore, on average, both the DDQN-based operator selection mechanism and the design of the five operators contribute to the performance of DDQN-MOCE.
To further establish the statistical significance of the DDQN mechanism and the energy-saving strategies, we conducted the Friedman non-parametric test on the HV, spacing, and GD results of D1, D2, and DDQN-MOCE, as shown in Table 5. All χ² values are high, and the p-values for all metrics are well below the 0.05 threshold, confirming a statistically significant difference in performance among the three algorithms. The superior rankings of DDQN-MOCE support the conclusion that the DDQN mechanism enables the algorithm to focus on high-yield operators and accelerates convergence. Furthermore, its outperformance of D2 indicates that the dedicated energy-saving strategies, ES1 and ES2, are needed to overcome the limitations of the basic operators in tec optimization.
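As a rough illustration of how the Friedman statistic is obtained, the sketch below ranks the algorithms within each instance and applies the textbook formula χ² = 12 / (n k (k + 1)) · Σ Rj² − 3 n (k + 1), where n is the number of instances, k the number of algorithms, and Rj the rank sum of algorithm j. The function names are hypothetical, no tie correction is applied, and the p-value helper covers only k = 3 (two degrees of freedom), which matches the three-algorithm ablation setting. SciPy's `scipy.stats.friedmanchisquare` offers a tested alternative.

```python
import math


def average_ranks(values):
    """Ranks (1 = smallest) with tied values sharing their average rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        avg = (i + j) / 2 + 1  # average of positions i..j, converted to 1-based
        for k in range(i, j + 1):
            ranks[order[k]] = avg
        i = j + 1
    return ranks


def friedman_statistic(table):
    """Friedman chi-square and mean ranks.

    table has one row per instance and one column per algorithm.
    No tie correction is applied in this sketch.
    """
    n, k = len(table), len(table[0])
    rank_sums = [0.0] * k
    for row in table:
        for col, r in enumerate(average_ranks(row)):
            rank_sums[col] += r
    chi2 = 12.0 / (n * k * (k + 1)) * sum(s * s for s in rank_sums) - 3.0 * n * (k + 1)
    return chi2, [s / n for s in rank_sums]


def p_value_three_algorithms(chi2):
    """Chi-square survival function for 2 degrees of freedom (k = 3 only)."""
    return math.exp(-chi2 / 2.0)
```

Because ranks are assigned from smallest to largest, a lower mean rank is better for spacing and GD, whereas for HV the values should be negated (or the ranks reversed) before interpreting the ranking.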

4.4. Comparisons to Other Algorithms

After validating the contribution of each component, we compare DDQN-MOCE with state-of-the-art MOEAs, namely NSGA-II [24], MOEA/D [40], MOPSO [41], and SPEA2 [42], to further examine its overall performance. All are well suited to multi-objective optimization, and their parameters are set exactly as in the original papers. Consistent with the experimental design in Section 4.3, all algorithms are tested on the same 36 instances, and each is run independently 20 times with a maximum of 200 iterations in the same operating environment. Table 6, Table 7 and Table 8 present the statistical results of HV, spacing, and GD for all algorithms. The notations “−/+” indicate that the compared algorithm is worse/better than DDQN-MOCE, while “≈” denotes no significant difference between them. Table 9 summarizes the dominance relationships among the algorithms. C(DDQN-MOCE, NSGA-II) is abbreviated as A and C(NSGA-II, DDQN-MOCE) as A′, with B, C, and D denoting C(DDQN-MOCE, MOEA/D), C(DDQN-MOCE, MOPSO), and C(DDQN-MOCE, SPEA2), respectively. The best values are highlighted in bold.
As shown in Table 6, the HV value of DDQN-MOCE consistently exceeds that of the other algorithms. It surpasses the second-best HV value by more than 50% in 34 instances and is a multiple of it in 26 instances. This demonstrates that the proposed algorithm achieves the best overall solution quality: it not only generates solutions closer to the true Pareto front but also occupies a larger volume in the objective space. As for the spacing metric in Table 7, DDQN-MOCE performs slightly worse. It significantly outperforms MOPSO in 21 instances and shows no substantial difference from SPEA2. Although NSGA-II and MOEA/D record more best values, they surpass DDQN-MOCE in only 15 instances. This is attributed to DDQN-MOCE discovering several times as many solutions as the other algorithms, together with a wider extreme-value span, which makes an uneven distribution more likely. However, as instance size grows, DDQN-MOCE’s spacing performance steadily improves, demonstrating its scalability. On average, the proposed algorithm is only marginally behind NSGA-II and MOEA/D. As shown in Table 8, DDQN-MOCE attains GD values that are markedly smaller than those of all competitors, reaffirming its convergence advantage. Table 9 reveals that DDQN-MOCE consistently delivers better results and dominates the remaining competitors, underscoring its convergence and solution quality.
To further evaluate the algorithm’s performance, three instances on which DDQN-MOCE performs relatively weakly on spacing are selected: the small-scale 5-5-20, the medium-scale 4-8-50, and the large-scale 5-8-100. From the Pareto fronts in Figure 5, Figure 6 and Figure 7 and the corresponding data in Table 10, Table 11 and Table 12, we observe that DDQN-MOCE produces the largest non-dominated set. Its front is located at the extreme lower-left corner, extends over the longest span along both the cmax and tec dimensions, and its leftmost point is markedly lower than those of the competitors. Thus, the C-metric, HV, and GD results mirror the earlier statistics. Examining each algorithm’s front individually shows that NSGA-II and MOEA/D exhibit no obvious gaps or local clustering, whereas the front of DDQN-MOCE alternates in density and is unevenly distributed. A pronounced sparsity appears in the cmax interval [2600, 4700] of the 5-8-100 instance. These observations explain the weaker performance under the spacing metric.
Finally, as in the ablation study, we conducted the Friedman non-parametric test on the performance of the five algorithms to ensure the statistical reliability of the comparative results. As indicated in Table 13, the very high χ² values (from 88.89 to 124.42) and p-values far below 0.05 (from 2.27 × 10^−18 to 6.07 × 10^−26) indicate a statistically significant difference in performance between DDQN-MOCE and its competitors. Although the proposed algorithm shows a slight relative weakness in the spacing metric (rank 3.69), its clear advantage in both HV and GD reflects a greater coverage area and extreme span within the objective space.

4.5. Comparisons on a Real-World Case

To evaluate the performance, practical feasibility, and generalizability of DDQN-MOCE, we conducted an on-site investigation of a medium-sized offshore wind turbine blade manufacturing enterprise and obtained the following information. The enterprise’s quarterly demand is 30 sets of wind turbine blades, produced at 3 distributed bases. The production process is simplified into five stages (layup, infusion, curing, assembly, and fine finishing). Crucially, the machine configurations are heterogeneous, with parallel machine counts of {3, 1, 3, 2, 2}, {2, 1, 2, 1, 1}, and {2, 1, 1, 1, 1}. The single-stage processing time is 2–10 h, SDST is 0.5–1.5 h, each buffer accommodates a maximum of three intermediate products, the unit processing energy consumption is 5–15 kW, and the unit idle energy consumption is 2–5 kW.
All algorithms were run independently 20 times, with other settings identical to those in Section 4.4. The results are presented in Table 14 and Table 15. In the C-metric table, the value in row a and column b represents the dominance rate of algorithm a over algorithm b. DDQN-MOCE consistently demonstrates substantial superiority over the other algorithms in HV, GD, and C-metric, with each differing from the competitors’ values by a factor of roughly two. While its spacing metric is slightly inferior, ranking fourth, this remains within the expected range. Although the average running time of DDQN-MOCE is longer than that of SPEA2, it brings a performance gain of nearly 90% in HV. Based on this cost–benefit trade-off, we consider the additional running time acceptable in a real-world manufacturing environment. Furthermore, we compared scheduling points on the Pareto front that had makespan values comparable to those of the competitors, and the statistics revealed potential energy savings of approximately 22% to 31%. Following the calculation approach based on the Pareto chart provided in ref. [34], we find that the algorithm designed in that study reduces energy consumption by approximately 10–28%. The energy improvement in [36] was about 10%. Although the research backgrounds, constraints, and algorithms are not entirely identical, this comparison nonetheless suggests that our algorithm is reasonable and capable of providing new insights for existing research.
In summary, DDQN-MOCE significantly outperforms NSGA-II, MOEA/D, MOPSO, and SPEA2. This is because, while expanding the global search scope, DDQN-MOCE selects the search operator best suited to the current scheduling environment. The proposed algorithm also incorporates problem-specific knowledge that closely matches the requirements of the problem. Consequently, DDQN-MOCE effectively addresses the DHFSP-SDST&LB and serves as an efficient solution to the blade-production scheduling problem.
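The energy comparison at comparable makespan can be sketched as follows. The matching rule is an assumption, since the text does not specify how "comparable" points were paired: here each competitor point (cmax, tec) is matched to the point on the proposed front whose cmax is closest, provided it lies within a relative tolerance, and the relative tec reduction is reported. The function name and the 5% default tolerance are hypothetical.

```python
def energy_savings_at_comparable_makespan(ours, competitor, tolerance=0.05):
    """Relative tec savings for competitor points that have a makespan match.

    ours, competitor: lists of (cmax, tec) points on each Pareto front.
    tolerance: maximum relative cmax difference for two points to be
               treated as comparable (an assumed value).
    """
    savings = []
    for c_cmax, c_tec in competitor:
        # Candidate matches: (cmax gap, our tec) for points within tolerance.
        candidates = [
            (abs(o_cmax - c_cmax), o_tec)
            for o_cmax, o_tec in ours
            if abs(o_cmax - c_cmax) <= tolerance * c_cmax
        ]
        if candidates:
            _, o_tec = min(candidates)  # closest makespan wins
            savings.append((c_tec - o_tec) / c_tec)
    return savings
```

Competitor points without any comparable counterpart are skipped, so the reported range depends on the chosen tolerance. A fixed baseline, as mentioned in the conclusions, would remove that dependence.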

5. Conclusions

This paper investigates DHFSP with SDST and LB (DHFSP-SDST&LB) within the context of offshore wind turbine blade manufacturing. First, a multi-objective MILP model is constructed to simultaneously minimize makespan and total energy consumption. Subsequently, a DDQN-driven multi-objective evolutionary algorithm (DDQN-MOCE) is proposed. This study is the first to integrate large marine engineering components such as blades into a multi-constrained DHFSP framework. By combining the dynamic decision-making capability of DDQN with the global search ability of EA, we overcome the tendency of traditional MOEAs to become trapped in local optima. Furthermore, the design includes five heuristic search operators, namely three critical-path-based makespan-oriented operators and two energy-saving strategies, ensuring a robust balance between cmax and tec. The 36 simulation instances and the real-world case study consistently demonstrate the superior overall performance of DDQN-MOCE. It surpasses the second-best HV value by more than 50% in 34 instances and is a multiple of it in 26 instances, and its GD values are markedly smaller than those of all competitors. Although DDQN-MOCE performs slightly worse on the spacing metric, it still significantly outperforms MOPSO in 21 instances and shows no substantial difference from SPEA2. DDQN-MOCE also dominates the remaining competitors, with C(DDQN-MOCE, Competitor) often approaching 1. Compared with the results of other studies (for example, the energy-saving benefits in [34,36] are 10% and above), our potential energy-saving benefit of over 22% is also a clear advantage, supporting the scientific contribution, rationality, and value of this research.
Using a combination of simulated and actual wind blade manufacturing data, this study demonstrates DDQN-MOCE’s scalability and potential for integration within manufacturing execution systems. Furthermore, with adjustments to domain-specific knowledge, this framework is transferable to other industrial scenarios involving SDST, LB constraints, or variable processing speeds, such as semiconductor manufacturing and large mold production. However, the current model simplifies the energy framework by omitting transportation, machine on/off, and setup energy consumption, which are tangible elements in real production. Therefore, our future research will investigate whether these omitted losses are critical factors influencing scheduling decisions and energy consumption in actual industrial applications. We will also develop methodologies to assess multi-objective improvements more precisely, such as a fixed-baseline approach for quantifying makespan and energy-saving percentages. Advanced and problem-specific genetic operators will also be studied, including a systematic investigation of the impact of non-standard crossover and mutation strategies. Additionally, lightweight learning frameworks will be explored to further enhance solution efficiency.

Author Contributions

Methodology, Q.Z. (Qinglei Zhang) and Q.Z. (Qianyuan Zhang); Validation, Q.Z. (Qinglei Zhang) and Q.Z. (Qianyuan Zhang); Formal analysis, Q.Z. (Qianyuan Zhang); Resources, Q.Z. (Qianyuan Zhang); Data curation, Q.Z. (Qinglei Zhang) and Q.Z. (Qianyuan Zhang); Writing—original draft preparation, Q.Z. (Qianyuan Zhang); Writing—review and editing, J.Q. and J.D.; Visualization, Y.Z. All authors have read and agreed to the published version of the manuscript.

Funding

This research received no external funding.

Data Availability Statement

The raw data supporting the conclusions of this article will be made available by the authors upon request.

Conflicts of Interest

The authors declare no conflicts of interest.

References

  1. Su, X.; Wang, X.; Xu, W.; Yuan, L.; Xiong, C.; Chen, J. Offshore Wind Power: Progress of the Edge Tool, Which Can Promote Sustainable Energy Development. Sustainability 2024, 16, 7810.
  2. Kim, A.; Kim, H.; Choe, C.; Lim, H. Feasibility of offshore wind turbines for linkage with onshore green hydrogen demands: A comparative economic analysis. Energy Convers. Manag. 2023, 277, 116662.
  3. Yang, J.; Chang, Y.; Zhang, L.; Hao, Y.; Yan, Q.; Wang, C. The life-cycle energy and environmental emissions of a typical offshore wind farm in China. J. Clean. Prod. 2018, 180, 316–324.
  4. Farina, A.; Anctil, A. Material consumption and environmental impact of wind turbines in the USA and globally. Resour. Conserv. Recycl. 2022, 176, 105938.
  5. Neufeld, J.S.; Schulz, S.; Buscher, U. A systematic review of multi-objective hybrid flow shop scheduling. Eur. J. Oper. Res. 2023, 309, 1–23.
  6. Levine, A.; Cook, J. Transportation of Large Wind Components: A Permitting and Regulatory Review; National Renewable Energy Lab. (NREL): Golden, CO, USA, 2016.
  7. Shao, W.; Shao, Z.; Pi, D. Multi-local search-based general variable neighborhood search for distributed flow shop scheduling in heterogeneous multi-factories. Appl. Soft Comput. 2022, 125, 109138.
  8. Yang, Y.; Li, X. A knowledge-driven constructive heuristic algorithm for the distributed assembly blocking flow shop scheduling problem. Expert Syst. Appl. 2022, 202, 117269.
  9. Niu, W.; Li, J.-q.; Jin, H.; Qi, R.; Sang, H.-y. Bi-objective optimization using an improved NSGA-II for energy-efficient scheduling of a distributed assembly blocking flowshop. Eng. Optim. 2022, 55, 719–740.
  10. Shi, H.; Si, H.; Qin, J. Energy-Efficient Scheduling for Resilient Container-Supply Hybrid Flow Shops Under Transportation Constraints and Stochastic Arrivals. J. Mar. Sci. Eng. 2025, 13, 1153.
  11. Li, J.; Lin, P.; Wu, X.; Song, D.; Yang, B.; Zhou, L. Scheduling optimization of ship plane block flow line considering dual resource constraints. Sci. Rep. 2024, 14, 30765.
  12. Zhong, Z.; Guo, Y.; Zhang, J.; Yang, S. Energy-aware Integrated Scheduling for Container Terminals with Conflict-free AGVs. J. Syst. Sci. Syst. Eng. 2023, 32, 413–443.
  13. Njiri, J.G.; Söffker, D. State-of-the-art in wind turbine control: Trends and challenges. Renew. Sustain. Energy Rev. 2016, 60, 377–393.
  14. Wang, Y.; Li, X.; Ma, Z. A Hybrid Local Search Algorithm for the Sequence Dependent Setup Times Flowshop Scheduling Problem with Makespan Criterion. Sustainability 2017, 9, 2318.
  15. Clarke, J.; McIlhagger, A.; Archer, E.; Dooher, T.; Flanagan, T.; Schubel, P. A Feature-Based Cost Estimation Model for Wind Turbine Blade Spar Caps. Appl. Syst. Innov. 2020, 3, 17.
  16. Yaurima, V.; Burtseva, L.; Tchernykh, A. Hybrid flowshop with unrelated machines, sequence-dependent setup time, availability constraints and limited buffers. Comput. Ind. Eng. 2009, 56, 1452–1463.
  17. Hakimzadeh Abyaneh, S.; Zandieh, M. Bi-objective hybrid flow shop scheduling with sequence-dependent setup times and limited buffers. Int. J. Adv. Manuf. Technol. 2011, 58, 309–325.
  18. Han, Y.; Li, J.; Sang, H.; Liu, Y.; Gao, K.; Pan, Q. Discrete evolutionary multi-objective optimization for energy-efficient blocking flow shop scheduling with setup time. Appl. Soft Comput. 2020, 93, 106343.
  19. Zhao, F.; Xu, Z.; Bao, H.; Xu, T.; Zhu, N.; Jonrinaldi. A cooperative whale optimization algorithm for energy-efficient scheduling of the distributed blocking flow-shop with sequence-dependent setup time. Comput. Ind. Eng. 2023, 178, 109082.
  20. Zhao, F.; Di, S.; Wang, L. A Hyperheuristic With Q-Learning for the Multiobjective Energy-Efficient Distributed Blocking Flow Shop Scheduling Problem. IEEE Trans. Cybern. 2023, 53, 3337–3350.
  21. Zheng, Q.; Zhang, Y.; Tian, H.; He, L. A cooperative adaptive genetic algorithm for reentrant hybrid flow shop scheduling with sequence-dependent setup time and limited buffers. Complex Intell. Syst. 2023, 10, 781–809.
  22. Luo, Y.; Liang, X.; Zhang, Y.; Tang, K.; Li, W. Energy-Aware Integrated Scheduling for Quay Crane and IGV in Automated Container Terminal. J. Mar. Sci. Eng. 2024, 12, 376.
  23. Müller, A.; Grumbach, F.; Kattenstroth, F. Reinforcement Learning for Two-Stage Permutation Flow Shop Scheduling—A Real-World Application in Household Appliance Production. IEEE Access 2024, 12, 11388–11399.
  24. Wang, Y.-J.; Wang, G.-G.; Tian, F.-M.; Gong, D.-W.; Pedrycz, W. Solving energy-efficient fuzzy hybrid flow-shop scheduling problem at a variable machine speed using an extended NSGA-II. Eng. Appl. Artif. Intell. 2023, 121, 105977.
  25. Schulz, S.; Neufeld, J.S.; Buscher, U. A multi-objective iterated local search algorithm for comprehensive energy-aware hybrid flow shop scheduling. J. Clean. Prod. 2019, 224, 421–434.
  26. Gao, K.; Cao, Z.; Zhang, L.; Chen, Z.; Han, Y.; Pan, Q. A review on swarm intelligence and evolutionary algorithms for solving flexible job shop scheduling problems. IEEE/CAA J. Autom. Sin. 2019, 6, 904–916.
  27. Jiang, E.; Wang, L.; Wang, J. Decomposition-based multi-objective optimization for energy-aware distributed hybrid flow shop scheduling with multiprocessor tasks. Tsinghua Sci. Technol. 2021, 26, 646–663.
  28. Zhang, W.; Xiao, G.; Gen, M.; Geng, H.; Wang, X.; Deng, M.; Zhang, G. Enhancing multi-objective evolutionary algorithms with machine learning for scheduling problems: Recent advances and survey. Front. Ind. Eng. 2024, 2, 1337174.
  29. Geng, K.; Liu, L.; Wu, S. A reinforcement learning based memetic algorithm for energy-efficient distributed two-stage flexible job shop scheduling problem. Sci. Rep. 2024, 14, 30816.
  30. Shi, J.; Liu, W.; Yang, J. An Enhanced Multi-Objective Evolutionary Algorithm with Reinforcement Learning for Energy-Efficient Scheduling in the Flexible Job Shop. Processes 2024, 12, 1976.
  31. Van Hasselt, H.; Guez, A.; Silver, D. Deep reinforcement learning with double q-learning. In Proceedings of the AAAI conference on artificial intelligence, Phoenix, AZ, USA, 12–17 February 2016.
  32. Han, B.-A.; Yang, J.-J. Research on adaptive job shop scheduling problems based on dueling double DQN. IEEE Access 2020, 8, 186474–186495.
  33. Zhang, Z.-Q.; Qian, B.; Hu, R.; Yang, J.-B. Q-learning-based hyper-heuristic evolutionary algorithm for the distributed assembly blocking flowshop scheduling problem. Appl. Soft Comput. 2023, 146, 110695.
  34. Chen, X.; Li, Y.; Wang, K.; Wang, L.; Liu, J.; Wang, J.; Wang, X.V. Reinforcement learning for distributed hybrid flowshop scheduling problem with variable task splitting towards mass personalized manufacturing. J. Manuf. Syst. 2024, 76, 188–206.
  35. Deng, L.; Di, Y.; Wang, L. A Reinforcement-Learning-Based 3-D Estimation of Distribution Algorithm for Fuzzy Distributed Hybrid Flow-Shop Scheduling Considering On-Time-Delivery. IEEE Trans. Cybern. 2024, 54, 1024–1036.
  36. Li, R.; Gong, W.; Wang, L.; Lu, C.; Pan, Z.; Zhuang, X. Double DQN-Based Coevolution for Green Distributed Heterogeneous Hybrid Flowshop Scheduling With Multiple Priorities of Jobs. IEEE Trans. Autom. Sci. Eng. 2024, 21, 6550–6562.
  37. Yang, Z.; Bi, L.; Jiao, X. Combining Reinforcement Learning Algorithms with Graph Neural Networks to Solve Dynamic Job Shop Scheduling Problems. Processes 2023, 11, 1571.
  38. Zhao, F.; Yin, F.; Wang, L.; Yu, Y. A Co-Evolution Algorithm With Dueling Reinforcement Learning Mechanism for the Energy-Aware Distributed Heterogeneous Flexible Flow-Shop Scheduling Problem. IEEE Trans. Syst. Man Cybern. Syst. 2025, 55, 1794–1809.
  39. Cao, S.; Li, R.; Gong, W.; Lu, C. Inverse model and adaptive neighborhood search based cooperative optimizer for energy-efficient distributed flexible job shop scheduling. Swarm Evol. Comput. 2023, 83, 101419.
  40. Wang, G.; Li, X.; Gao, L.; Li, P. Energy-efficient distributed heterogeneous welding flow shop scheduling problem using a modified MOEA/D. Swarm Evol. Comput. 2021, 62, 100858.
  41. Liu, Y.; Liao, X.; Zhang, R. An Enhanced MOPSO Algorithm for Energy-Efficient Single-Machine Production Scheduling. Sustainability 2019, 11, 5381.
  42. He, F.; Shen, K.; Guan, L.; Jiang, M. Research on Energy-Saving Scheduling of a Forging Stock Charging Furnace Based on an Improved SPEA2 Algorithm. Sustainability 2017, 9, 2154.
Figure 1. The layout of one hybrid flow shop.
Figure 2. Flowchart of DDQN-MOCE.
Figure 3. Depiction of the critical path.
Figure 4. Main effect plots of HV.
Figure 5. Comparison result of Pareto fronts of all algorithms of 5-5-20.
Figure 6. Comparison result of Pareto fronts of all algorithms of 4-8-50.
Figure 7. Comparison result of Pareto fronts of all algorithms of 5-8-100.
Table 1. Literature overview in the field of manufacturing scheduling.
Study  Marine Domain  Problem  SDST  Buffer  Objective(s)  Algorithm  Dynamic Operator Selection
Our study  √ (Offshore wind blade manufacturing)  DHFSP  √ (Limited)  Makespan, Tec  EA + DDQN
Shi et al. [10]  √ (Container production)  HFSP  ×  √ (Only one and infinite)  Makespan, Tec  MGCOA + Q-Learning
Li et al. [11]  √ (Ship plane)  Blocking FSP  ×  √ (Limited)  Makespan  GWO + NEH  ×
Zhong et al. [12]  √ (Container logistics)  Joint scheduling of AGVs and YCs  ×  √ (Limited)  Tec  A bi-level GA  ×
Yaurima et al. [16]  × (Television)  HFSP  √ (Limited)  Makespan  GA  ×
Hakimzadeh et al. [17]  × (PCB assembly)  HFSP  √ (Limited)  Makespan, Tardiness  NSGA-II/SPGA-II  ×
Han et al. [18]  × (Theoretical workshop scheduling)  Blocking FSP  ×  Makespan, Tec  Self-adaptive discrete MOEA  ×
Zhao et al. [19]  × (Theoretical workshop scheduling)  Blocking DFSP  ×  Makespan, Tec, Tardiness  Cooperative Whale Optimization Algorithm  ×
Zhao, Di, and Wang [20]  × (Theoretical workshop scheduling)  Blocking DFSP  ×  ×  Tec, Tardiness  Hyperheuristic with Q-Learning
Zheng et al. [21]  × (Theoretical workshop scheduling)  Reentrant HFSP  Total weighted completion time  GA + ACS + Modified Greedy Heuristic  ×
Müller et al. [23]  × (Household appliance production)  Two-stage permutation FSP  √ (Limited)  Idle times, setup efforts  PPO
Zhang et al. [33]  × (Theoretical workshop scheduling)  Assembly blocking DFSP  ×  ×  Makespan  Hyper-Heuristic EA + Q-Learning
Chen et al. [34]  × (Mass personalized manufacturing)  Variable Task Splitting DHFSP  ×  ×  Makespan, Tec  MOEA/D + Q-Learning
Deng et al. [35]  × (Mass-customization)  Fuzzy DHFSP  ×  Makespan, Tec  3D-EDA + Q-Learning
Li et al. [36]  × (Large engineering equipment)  Heterogeneous DHFSP  ×  ×  Total weighted Tardiness, Tec  EA + DDQN
Yang et al. [37]  × (Smart Factory)  Dynamic JSP  ×  ×  The earlier and later completion time  GNN + DDQN
Table 2. Notations and descriptions.
Notation | Description
J | Job set, indexed by j, j ∈ J = {0, …, n − 1}
F | Factory set, indexed by f
S | Stage set, indexed by s
Mfs | The set of machines at stage s of factory f, indexed by i, i ∈ Mfsc = {0, …, mfs}
L | Speed set, indexed by v, v ∈ L = {1, 2, 3}, corresponding to high-speed, medium-speed, and low-speed
pijsfv | The processing time of job j on machine i at stage s of factory f under speed v
pijj′sf | The setup time from job j to job j′ on machine i at stage s of factory f
τifv | The energy consumption per unit time (kW) on machine i of factory f under speed v
φifv | The idle energy consumption per unit time (kW) on machine i of factory f under speed v
vl | Speed factor of speed level l
bsf | The buffer capacity at stage s of factory f
M | A sufficiently large positive number
cmax | Maximum completion time (makespan)
tec | The total energy consumption
yjf | Decision variable; if job j is allocated to factory f, yjf = 1; otherwise, yjf = 0
xijsfv | Decision variable; if job j is processed on machine i at stage s of factory f under speed v, xijsfv = 1; otherwise, xijsfv = 0
zijj′sf | Decision variable; if job j′ is the successor of job j on machine i at stage s of factory f, zijj′sf = 1; otherwise, zijj′sf = 0
wjsf | Decision variable; if job j occupies the buffer at stage s of factory f, wjsf = 1; otherwise, wjsf = 0
sijsf | Continuous variable; starting time of job j on machine i at stage s of factory f
cijsf | Continuous variable; completion time of job j on machine i at stage s of factory f
Table 3. Comparison results of D1, D2, and DDQN-MOCE on HV, spacing, and GD.
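| Problem scale | D1 HV | D1 Spacing | D1 GD | D2 HV | D2 Spacing | D2 GD | DDQN-MOCE HV | DDQN-MOCE Spacing | DDQN-MOCE GD |
|---|---|---|---|---|---|---|---|---|---|
| 3-2-20 | 0.8654 | 0.0714 | 0.1677 | 0.9019 | 0.0598 | 0.1437 | 1.0261 * | 0.0507 | 0.0603 |
| 3-5-20 | 0.6565 | 0.0606 | 0.3352 | 0.7134 | 0.0899 | 0.3200 | 0.9072 | 0.0297 | 0.1368 |
| 3-8-20 | 0.7477 | 0.0607 | 0.2743 | 0.7823 | 0.0555 | 0.2639 | 0.9630 | 0.0592 | 0.1028 |
| 4-2-20 | 0.7987 | 0.0517 | 0.2105 | 0.8438 | 0.0553 | 0.1994 | 0.8379 | 0.0432 | 0.1261 |
| 4-5-20 | 0.8095 | 0.0966 | 0.2694 | 0.8573 | 0.0670 | 0.229 | 1.0238 | 0.0447 | 0.0925 |
| 4-8-20 | 0.5782 | 0.0627 | 0.4098 | 0.6523 | 0.0647 | 0.3768 | 0.9243 | 0.0319 | 0.1342 |
| 5-2-20 | 0.7868 | 0.0804 | 0.2804 | 0.8435 | 0.0821 | 0.2305 | 0.8928 | 0.0619 | 0.1302 |
| 5-5-20 | 0.7545 | 0.075 | 0.3261 | 0.8024 | 0.0574 | 0.2787 | 0.9238 | 0.0339 | 0.1605 |
| 5-8-20 | 0.6837 | 0.073 | 0.2981 | 0.7120 | 0.0708 | 0.2712 | 0.8794 | 0.0674 | 0.1106 |
| 3-2-50 | 0.7308 | 0.0735 | 0.2143 | 0.7938 | 0.0806 | 0.1775 | 0.9355 | 0.0707 | 0.0881 |
| 3-5-50 | 0.8130 | 0.0604 | 0.2091 | 0.8552 | 0.0765 | 0.1898 | 0.9339 | 0.0515 | 0.1033 |
| 3-8-50 | 0.5971 | 0.0711 | 0.3377 | 0.6471 | 0.0625 | 0.3008 | 0.9222 | 0.0451 | 0.1453 |
| 4-2-50 | 0.7444 | 0.0468 | 0.2342 | 0.8388 | 0.0621 | 0.1852 | 0.8035 | 0.0340 | 0.1410 |
| 4-5-50 | 0.6755 | 0.0754 | 0.3118 | 0.706 | 0.0702 | 0.298 | 0.8553 | 0.0621 | 0.1191 |
| 4-8-50 | 0.7402 | 0.0642 | 0.2291 | 0.792 | 0.0528 | 0.1963 | 0.9061 | 0.0574 | 0.0827 |
| 5-2-50 | 0.6237 | 0.0621 | 0.2754 | 0.7229 | 0.0703 | 0.1642 | 0.7662 | 0.0384 | 0.1398 |
| 5-5-50 | 0.5641 | 0.0468 | 0.3611 | 0.6883 | 0.0518 | 0.2667 | 0.8057 | 0.0265 | 0.1387 |
| 5-8-50 | 0.6243 | 0.0526 | 0.2999 | 0.7104 | 0.0677 | 0.2487 | 0.8347 | 0.0363 | 0.0947 |
| 3-2-100 | 0.7956 | 0.0512 | 0.2246 | 0.8707 | 0.0868 | 0.1658 | 1.0079 | 0.0591 | 0.0885 |
| 3-5-100 | 0.4721 | 0.0584 | 0.4226 | 0.6921 | 0.031 | 0.215 | 0.7828 | 0.0440 | 0.1505 |
| 3-8-100 | 0.7753 | 0.0296 | 0.2064 | 0.8151 | 0.0349 | 0.1848 | 0.9802 | 0.0378 | 0.0835 |
| 4-2-100 | 0.7869 | 0.0556 | 0.2372 | 0.8766 | 0.0547 | 0.1544 | 0.9106 | 0.0652 | 0.1356 |
| 4-5-100 | 0.7944 | 0.0695 | 0.2028 | 0.8948 | 0.0401 | 0.1352 | 0.9388 | 0.0490 | 0.0997 |
| 4-8-100 | 0.458 | 0.0458 | 0.3434 | 0.6002 | 0.0340 | 0.1681 | 0.6673 | 0.0176 | 0.1314 |
| 5-2-100 | 0.7686 | 0.0596 | 0.1865 | 0.8443 | 0.0548 | 0.1354 | 0.9220 | 0.0586 | 0.0867 |
| 5-5-100 | 0.5252 | 0.0417 | 0.5013 | 0.767 | 0.0409 | 0.3121 | 0.7196 | 0.0399 | 0.2825 |
| 5-8-100 | 0.7419 | 0.0457 | 0.2632 | 0.8679 | 0.0461 | 0.1711 | 0.9866 | 0.0516 | 0.0986 |
| 3-2-200 | 0.6307 | 0.052 | 0.3675 | 0.8156 | 0.0646 | 0.1978 | 0.8842 | 0.0686 | 0.1406 |
| 3-5-200 | 0.6494 | 0.0271 | 0.2172 | 0.7614 | 0.0264 | 0.1728 | 0.9285 | 0.0377 | 0.0646 |
| 3-8-200 | 0.5227 | 0.0529 | 0.3597 | 0.7294 | 0.0698 | 0.1983 | 0.7395 | 0.0588 | 0.1419 |
| 4-2-200 | 0.4253 | 0.0515 | 0.4872 | 0.6195 | 0.0500 | 0.1944 | 0.8108 | 0.0441 | 0.1347 |
| 4-5-200 | 0.4035 | 0.0416 | 0.571 | 0.6264 | 0.0365 | 0.2819 | 0.7476 | 0.0228 | 0.2512 |
| 4-8-200 | 0.6418 | 0.0277 | 0.2914 | 0.8268 | 0.0309 | 0.1885 | 0.9603 | 0.0365 | 0.0781 |
| 5-2-200 | 0.657 | 0.0637 | 0.2925 | 0.7572 | 0.0589 | 0.2133 | 0.7509 | 0.0444 | 0.2310 |
| 5-5-200 | 0.6101 | 0.0627 | 0.3198 | 0.7727 | 0.0586 | 0.1428 | 0.8557 | 0.0677 | 0.1458 |
| 5-8-200 | 0.2978 | 0.0406 | 0.5810 | 0.5328 | 0.0362 | 0.3373 | 0.6959 | 0.0304 | 0.1999 |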
* Bold values with a gray background highlight the superior performance.
Table 4. Comparison results of D1, D2, and DDQN-MOCE on C-metric.
| Problem scale | C(DDQN-MOCE, D1) | C(D1, DDQN-MOCE) | C(DDQN-MOCE, D2) | C(D2, DDQN-MOCE) | C(D1, D2) | C(D2, D1) |
|---|---|---|---|---|---|---|
| 3-2-20 | 0.9190 * | 0.0150 | 0.8843 | 0.0506 | 0.2542 | 0.5711 |
| 3-5-20 | 0.7971 | 0.0125 | 0.7496 | 0.0100 | 0.3020 | 0.5825 |
| 3-8-20 | 0.7453 | 0.0171 | 0.7631 | 0.01 | 0.2722 | 0.4859 |
| 4-2-20 | 0.6492 | 0.1125 | 0.504 | 0.1319 | 0.3492 | 0.5411 |
| 4-5-20 | 0.8921 | 0.0100 | 0.8807 | 0.01 | 0.2214 | 0.5982 |
| 4-8-20 | 0.9301 | 0 | 0.8656 | 0 | 0.2670 | 0.5896 |
| 5-2-20 | 0.7062 | 0.0614 | 0.6648 | 0.0855 | 0.2393 | 0.5643 |
| 5-5-20 | 0.762 | 0.0083 | 0.7386 | 0.0225 | 0.3015 | 0.5432 |
| 5-8-20 | 0.7393 | 0.005 | 0.7694 | 0.0050 | 0.3120 | 0.6115 |
| 3-2-50 | 0.8949 | 0.0328 | 0.8665 | 0.0405 | 0.2065 | 0.6764 |
| 3-5-50 | 0.7316 | 0.0401 | 0.7120 | 0.1404 | 0.3075 | 0.5778 |
| 3-8-50 | 0.8294 | 0.0155 | 0.6965 | 0.0127 | 0.2724 | 0.6567 |
| 4-2-50 | 0.4654 | 0.0556 | 0.4612 | 0.1306 | 0.1776 | 0.6863 |
| 4-5-50 | 0.6182 | 0.0351 | 0.6735 | 0.0536 | 0.3046 | 0.5196 |
| 4-8-50 | 0.7957 | 0.0369 | 0.7405 | 0.0542 | 0.2710 | 0.5985 |
| 5-2-50 | 0.647 | 0 | 0.4565 | 0.1146 | 0.1405 | 0.7785 |
| 5-5-50 | 0.7791 | 0 | 0.6855 | 0.0125 | 0.1701 | 0.7332 |
| 5-8-50 | 0.776 | 0.0394 | 0.7449 | 0.0610 | 0.1944 | 0.7225 |
| 3-2-100 | 0.8647 | 0.0354 | 0.7498 | 0.1170 | 0.1766 | 0.7108 |
| 3-5-100 | 0.7036 | 0.053 | 0.4325 | 0.0967 | 0.0167 | 0.8703 |
| 3-8-100 | 0.9547 | 0.0167 | 0.9168 | 0.0301 | 0.2504 | 0.6254 |
| 4-2-100 | 0.7946 | 0.1143 | 0.5523 | 0.2960 | 0.1637 | 0.7684 |
| 4-5-100 | 0.8151 | 0.0985 | 0.6421 | 0.2099 | 0.1161 | 0.8272 |
| 4-8-100 | 0.3717 | 0.0167 | 0.1035 | 0.0250 | 0.0507 | 0.8935 |
| 5-2-100 | 0.7896 | 0.1137 | 0.6198 | 0.2693 | 0.1534 | 0.7164 |
| 5-5-100 | 0.6918 | 0.0196 | 0.3874 | 0.1871 | 0.0183 | 0.8841 |
| 5-8-100 | 0.9679 | 0.0211 | 0.818 | 0.1091 | 0.0495 | 0.8823 |
| 3-2-200 | 0.894 | 0.0251 | 0.6784 | 0.2359 | 0.0493 | 0.889 |
| 3-5-200 | 0.9798 | 0.0020 | 0.9605 | 0.0249 | 0.1959 | 0.7300 |
| 3-8-200 | 0.6907 | 0.0012 | 0.4663 | 0.1435 | 0.0367 | 0.8832 |
| 4-2-200 | 0.7145 | 0.0394 | 0.164 | 0.6361 | 0.0163 | 0.9142 |
| 4-5-200 | 0.4587 | 0.0250 | 0.2247 | 0.2833 | 0.0368 | 0.8821 |
| 4-8-200 | 0.9781 | 0.0065 | 0.8672 | 0.0907 | 0.0918 | 0.8404 |
| 5-2-200 | 0.6637 | 0.1639 | 0.3531 | 0.4403 | 0.1514 | 0.7339 |
| 5-5-200 | 0.6065 | 0.1052 | 0.2975 | 0.4207 | 0.0881 | 0.8479 |
| 5-8-200 | 0.5275 | 0 | 0.1169 | 0.325 | 0 | 0.9292 |
* Bold values with a gray background highlight the superior performance.
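The C-metric columns can be read as set coverage between two non-dominated fronts. The following is a minimal sketch, assuming the common definition in which C(A, B) is the fraction of solutions in B that are weakly dominated by at least one solution in A, and assuming both objectives (makespan and TEC) are minimized. The function names and the toy fronts are illustrative and are not taken from the paper.

```python
from typing import Sequence, Tuple

# (makespan, total energy consumption); both are assumed to be minimized.
Point = Tuple[float, float]


def weakly_dominates(a: Point, b: Point) -> bool:
    """Return True if a is no worse than b in every objective."""
    return all(x <= y for x, y in zip(a, b))


def c_metric(front_a: Sequence[Point], front_b: Sequence[Point]) -> float:
    """Fraction of front_b covered by at least one point of front_a."""
    if not front_b:
        raise ValueError("front_b must not be empty")
    covered = sum(
        1 for b in front_b if any(weakly_dominates(a, b) for a in front_a)
    )
    return covered / len(front_b)


# Hypothetical fronts, for illustration only.
front_a = [(10.0, 5.0), (8.0, 7.0)]
front_b = [(11.0, 6.0), (9.0, 8.0), (7.0, 9.0)]

print(c_metric(front_a, front_b))  # coverage of front_b by front_a
print(c_metric(front_b, front_a))  # coverage of front_a by front_b
```

C(A, B) and C(B, A) need not sum to 1, which is why the table reports both directions for every pair.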
Table 5. Statistical summary of performance ranks and Friedman test results for ablation study variants (D1, D2, and DDQN-MOCE; significance level = 0.05).
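| Variants | HV Rank | HV χ² | HV p-Value | Spacing Rank | Spacing χ² | Spacing p-Value | GD Rank | GD χ² | GD p-Value |
|---|---|---|---|---|---|---|---|---|---|
| D1 | 3.00 | 64.89 | $8.12 \times 10^{-15}$ | 2.31 | 10.06 | $6.55 \times 10^{-3}$ | 3.00 | 68.22 | $1.53 \times 10^{-15}$ |
| D2 | 1.89 | | | 2.11 | | | 1.94 | | |
| DDQN-MOCE | 1.11 | | | 1.58 | | | 1.06 | | |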
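The χ² and p-values in Table 5 can be cross-checked from the mean ranks. The sketch below assumes the standard Friedman statistic without tie correction over n = 36 instances and k = 3 variants. The integer rank sums are an assumption: they are obtained by multiplying each reported mean rank by 36 and rounding (for example, 1.11 × 36 ≈ 40). The function names are illustrative. With k = 3 the test has 2 degrees of freedom, for which the chi-square tail probability reduces to exp(−χ²/2).

```python
import math
from typing import Sequence


def friedman_chi2(rank_sums: Sequence[float], n: int) -> float:
    """Friedman statistic from per-algorithm rank sums over n instances (no tie correction)."""
    k = len(rank_sums)
    return 12.0 / (n * k * (k + 1)) * sum(r * r for r in rank_sums) - 3.0 * n * (k + 1)


def chi2_tail_df2(x: float) -> float:
    """Upper-tail probability of a chi-square variable with 2 degrees of freedom."""
    return math.exp(-x / 2.0)


N_INSTANCES = 36

# Rank sums inferred from the reported mean ranks (assumption, see lead-in).
# Order: D1, D2, DDQN-MOCE.
rank_sums = {
    "HV": [108, 68, 40],
    "Spacing": [83, 76, 57],
    "GD": [108, 70, 38],
}

results = {}
for metric, sums in rank_sums.items():
    chi2 = friedman_chi2(sums, N_INSTANCES)
    results[metric] = (chi2, chi2_tail_df2(chi2))
    print(f"{metric}: chi2 = {chi2:.2f}, p = {results[metric][1]:.2e}")
```

Under these assumptions the statistic comes out at about 64.89, 10.06, and 68.22 for HV, spacing, and GD, which agrees with the table to two decimals.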
Table 6. Statistical results of all comparison algorithms on HV.
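| Problem scale | DDQN-MOCE Mean | DDQN-MOCE Std | NSGA-II Mean | NSGA-II Std | MOEA/D Mean | MOEA/D Std | MOPSO Mean | MOPSO Std | SPEA2 Mean | SPEA2 Std |
|---|---|---|---|---|---|---|---|---|---|---|
| 3-2-20 | 1.0677 * | 0.0610 | 0.2339 | 0.0306 | 0.5702 | 0.1658 | 0.2016 | 0.0195 | 0.3696 | 0.1336 |
| 3-5-20 | 1.0333 | 0.0785 | 0.2466 − | 0.0310 | 0.4344 − | 0.0854 | 0.2556 − | 0.0229 | 0.4055 − | 0.1090 |
| 3-8-20 | 1.0369 | 0.0721 | 0.4569 − | 0.0513 | 0.5755 − | 0.1278 | 0.4010 − | 0.0372 | 0.5757 − | 0.1175 |
| 4-2-20 | 1.0096 | 0.0791 | 0.365 − | 0.0551 | 0.5644 − | 0.1534 | 0.2892 − | 0.0315 | 0.4025 − | 0.0994 |
| 4-5-20 | 0.9807 | 0.0765 | 0.3639 − | 0.0548 | 0.5033 − | 0.0963 | 0.4427 − | 0.0369 | 0.5166 − | 0.0812 |
| 4-8-20 | 0.9569 | 0.0723 | 0.4560 − | 0.0570 | 0.5314 − | 0.1222 | 0.4573 − | 0.0378 | 0.6158 − | 0.1179 |
| 5-2-20 | 0.9369 | 0.1001 | 0.5679 − | 0.0731 | 0.7036 − | 0.1359 | 0.4877 − | 0.0488 | 0.6364 − | 0.1526 |
| 5-5-20 | 0.9385 | 0.0655 | 0.4055 − | 0.0615 | 0.5635 − | 0.1445 | 0.3693 − | 0.0360 | 0.5518 − | 0.1480 |
| 5-8-20 | 0.8455 | 0.1193 | 0.4605 − | 0.0885 | 0.5581 − | 0.1183 | 0.5071 − | 0.0578 | 0.5966 − | 0.0977 |
| 3-2-50 | 0.9162 | 0.0547 | 0.3771 − | 0.0365 | 0.4610 − | 0.0569 | 0.5459 − | 0.0290 | 0.6008 − | 0.0384 |
| 3-5-50 | 0.8638 | 0.0687 | 0.2106 − | 0.0265 | 0.2871 − | 0.0573 | 0.2504 − | 0.0247 | 0.2956 − | 0.0775 |
| 3-8-50 | 1.0165 | 0.0758 | 0.1878 − | 0.0346 | 0.2472 − | 0.0539 | 0.2517 − | 0.0297 | 0.3039 − | 0.0543 |
| 4-2-50 | 0.9356 | 0.0871 | 0.1358 − | 0.0147 | 0.2763 − | 0.0764 | 0.1197 − | 0.0079 | 0.2016 − | 0.0596 |
| 4-5-50 | 0.9284 | 0.0863 | 0.2086 − | 0.0376 | 0.2766 − | 0.0647 | 0.2285 − | 0.0241 | 0.3108 − | 0.0527 |
| 4-8-50 | 0.9988 | 0.0722 | 0.2374 − | 0.0306 | 0.2648 − | 0.0735 | 0.2592 − | 0.0163 | 0.3119 − | 0.0422 |
| 5-2-50 | 1.0291 | 0.0806 | 0.1415 − | 0.0212 | 0.2691 − | 0.0653 | 0.1473 − | 0.0151 | 0.2119 − | 0.0844 |
| 5-5-50 | 1.0463 | 0.0688 | 0.1672 − | 0.0162 | 0.2394 − | 0.0535 | 0.1751 − | 0.0151 | 0.2498 − | 0.0590 |
| 5-8-50 | 0.9028 | 0.0530 | 0.373 − | 0.0430 | 0.6562 − | 0.1816 | 0.3412 − | 0.0314 | 0.5505 − | 0.1341 |
| 3-2-100 | 0.7965 | 0.0536 | 0.2008 − | 0.0305 | 0.2497 − | 0.0449 | 0.2889 − | 0.0201 | 0.3358 − | 0.0529 |
| 3-5-100 | 1.0676 | 0.0819 | 0.1587 − | 0.0124 | 0.2045 − | 0.0339 | 0.1682 − | 0.0130 | 0.1943 − | 0.0348 |
| 3-8-100 | 0.9066 | 0.0228 | 0.2305 − | 0.0070 | 0.2410 − | 0.0237 | 0.2400 − | 0.0135 | 0.2674 − | 0.0120 |
| 4-2-100 | 0.9499 | 0.0616 | 0.2358 − | 0.0252 | 0.3246 − | 0.0633 | 0.2307 − | 0.0136 | 0.3206 − | 0.0498 |
| 4-5-100 | 0.8442 | 0.0609 | 0.1901 − | 0.0234 | 0.2204 − | 0.0284 | 0.1857 − | 0.0180 | 0.2474 − | 0.0358 |
| 4-8-100 | 0.9703 | 0.1403 | 0.0709 − | 0.0085 | 0.1054 − | 0.0287 | 0.0770 − | 0.0050 | 0.1065 − | 0.0184 |
| 5-2-100 | 0.8745 | 0.0455 | 0.3043 − | 0.0251 | 0.3257 − | 0.0397 | 0.4155 − | 0.0229 | 0.4186 − | 0.0254 |
| 5-5-100 | 0.9904 | 0.0869 | 0.1141 − | 0.0162 | 0.1689 − | 0.0328 | 0.1236 − | 0.0104 | 0.1775 − | 0.0353 |
| 5-8-100 | 0.8401 | 0.0351 | 0.2012 − | 0.0321 | 0.2127 − | 0.0161 | 0.2145 − | 0.0155 | 0.2577 − | 0.0238 |
| 3-2-200 | 0.8716 | 0.0386 | 0.1818 − | 0.0185 | 0.2396 − | 0.0433 | 0.1967 − | 0.0123 | 0.2418 − | 0.0350 |
| 3-5-200 | 0.9019 | 0.0220 | 0.2227 − | 0.0152 | 0.2271 − | 0.0261 | 0.2327 − | 0.0098 | 0.2490 − | 0.0250 |
| 3-8-200 | 1.0519 | 0.0728 | 0.1102 − | 0.0180 | 0.1701 − | 0.0256 | 0.1182 − | 0.0151 | 0.1469 − | 0.0221 |
| 4-2-200 | 0.9274 | 0.0987 | 0.1216 − | 0.0098 | 0.2091 − | 0.0474 | 0.1353 − | 0.0094 | 0.1744 − | 0.0461 |
| 4-5-200 | 1.0067 | 0.0967 | 0.1075 − | 0.0129 | 0.1295 − | 0.0222 | 0.1244 − | 0.0088 | 0.1510 − | 0.0237 |
| 4-8-200 | 0.8410 | 0.0244 | 0.1861 − | 0.0086 | 0.2032 − | 0.0267 | 0.2016 − | 0.0068 | 0.2116 − | 0.0211 |
| 5-2-200 | 0.8350 | 0.0914 | 0.1938 − | 0.0324 | 0.2583 − | 0.0469 | 0.3284 − | 0.0261 | 0.3431 − | 0.0313 |
| 5-5-200 | 1.1051 | 0.0196 | 0.1350 − | 0.0078 | 0.1750 − | 0.0187 | 0.1499 − | 0.0117 | 0.1966 − | 0.0095 |
| 5-8-200 | 1.0542 | 0.4001 | 0.0575 − | 0.0138 | 0.1017 − | 0.0367 | 0.0622 − | 0.0147 | 0.1046 − | 0.0416 |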
* Bold values with a gray background highlight the superior performance.
Table 7. Statistical results of all comparison algorithms on spacing.
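| Problem scale | DDQN-MOCE Mean | DDQN-MOCE Std | NSGA-II Mean | NSGA-II Std | MOEA/D Mean | MOEA/D Std | MOPSO Mean | MOPSO Std | SPEA2 Mean | SPEA2 Std |
|---|---|---|---|---|---|---|---|---|---|---|
| 3-2-20 | 0.0283 | 0.0185 | 0.0126 ≈ * | 0.0155 | 0.0158 ≈ | 0.0235 | 0.0426 − | 0.0219 | 0.0146 ≈ | 0.0144 |
| 3-5-20 | 0.0284 | 0.0206 | 0.0213 ≈ | 0.0244 | 0.0100 + | 0.0143 | 0.0288 ≈ | 0.0246 | 0.0280 ≈ | 0.0129 |
| 3-8-20 | 0.0532 | 0.0410 | 0.0136 + | 0.0199 | 0.0268 + | 0.0320 | 0.0444 ≈ | 0.0368 | 0.0432 ≈ | 0.0303 |
| 4-2-20 | 0.0253 | 0.0274 | 0.0056 + | 0.0094 | 0.0151 ≈ | 0.0170 | 0.0638 − | 0.0559 | 0.0282 ≈ | 0.0318 |
| 4-5-20 | 0.0502 | 0.0429 | 0.0419 ≈ | 0.0451 | 0.0344 ≈ | 0.0352 | 0.0430 ≈ | 0.0336 | 0.0411 ≈ | 0.0340 |
| 4-8-20 | 0.0646 | 0.0398 | 0.0222 + | 0.0217 | 0.0370 + | 0.0285 | 0.0820 − | 0.0662 | 0.0410 + | 0.0260 |
| 5-2-20 | 0.0337 | 0.0395 | 0.0123 + | 0.0196 | 0.0223 + | 0.0392 | 0.0554 − | 0.0496 | 0.0325 ≈ | 0.0476 |
| 5-5-20 | 0.0426 | 0.0234 | 0.0253 + | 0.0361 | 0.0214 + | 0.0279 | 0.0626 − | 0.0402 | 0.0491 ≈ | 0.0373 |
| 5-8-20 | 0.0488 | 0.0413 | 0.0474 ≈ | 0.0387 | 0.0250 + | 0.0354 | 0.0549 ≈ | 0.0436 | 0.0486 ≈ | 0.0358 |
| 3-2-50 | 0.0400 | 0.0361 | 0.0358 ≈ | 0.0254 | 0.0264 ≈ | 0.0366 | 0.0682 − | 0.0607 | 0.0529 ≈ | 0.0324 |
| 3-5-50 | 0.0364 | 0.0320 | 0.0350 ≈ | 0.0353 | 0.0343 ≈ | 0.0384 | 0.0517 − | 0.0385 | 0.0322 ≈ | 0.0206 |
| 3-8-50 | 0.0460 | 0.0257 | 0.0470 ≈ | 0.0589 | 0.0330 ≈ | 0.0360 | 0.0478 ≈ | 0.0363 | 0.0550 ≈ | 0.0333 |
| 4-2-50 | 0.0054 | 0.0018 | 0.0014 ≈ | 0.0046 | 0.0015 ≈ | 0.0065 | 0.0224 − | 0.0167 | 0.0046 ≈ | 0.0100 |
| 4-5-50 | 0.0462 | 0.0316 | 0.0459 ≈ | 0.0446 | 0.0321 ≈ | 0.0346 | 0.0550 ≈ | 0.0549 | 0.046 ≈ | 0.0294 |
| 4-8-50 | 0.0653 | 0.0392 | 0.0372 + | 0.0350 | 0.0317 + | 0.0359 | 0.0834 − | 0.0669 | 0.0371 + | 0.0282 |
| 5-2-50 | 0.0272 | 0.0304 | 0.0127 ≈ | 0.0261 | 0.0141 ≈ | 0.0280 | 0.0382 ≈ | 0.0292 | 0.0268 ≈ | 0.0320 |
| 5-5-50 | 0.0224 | 0.0244 | 0.0276 ≈ | 0.0233 | 0.0304 ≈ | 0.0218 | 0.0412 − | 0.0365 | 0.0320 ≈ | 0.0229 |
| 5-8-50 | 0.0718 | 0.0424 | 0.0199 + | 0.0213 | 0.0362 + | 0.0287 | 0.0627 ≈ | 0.0607 | 0.0290 + | 0.0193 |
| 3-2-100 | 0.0457 | 0.0388 | 0.0460 ≈ | 0.0354 | 0.0282 + | 0.0366 | 0.0644 − | 0.0281 | 0.0439 ≈ | 0.0193 |
| 3-5-100 | 0.0315 | 0.0255 | 0.0163 ≈ | 0.0168 | 0.0316 ≈ | 0.0359 | 0.0540 − | 0.0446 | 0.0296 ≈ | 0.0259 |
| 3-8-100 | 0.0512 | 0.0141 | 0.0151 + | 0.0016 | 0.0250 + | 0.0116 | 0.0338 + | 0.0173 | 0.0287 + | 0.0072 |
| 4-2-100 | 0.0392 | 0.0323 | 0.0068 + | 0.0144 | 0.0293 ≈ | 0.0265 | 0.0708 − | 0.0380 | 0.0183 + | 0.0234 |
| 4-5-100 | 0.0438 | 0.0362 | 0.0326 ≈ | 0.0341 | 0.026 + | 0.0316 | 0.0690 − | 0.0424 | 0.0471 ≈ | 0.0375 |
| 4-8-100 | 0.0167 | 0.0182 | 0.0233 ≈ | 0.0194 | 0.0210 ≈ | 0.0183 | 0.0360 − | 0.0118 | 0.0153 ≈ | 0.0069 |
| 5-2-100 | 0.0365 | 0.0361 | 0.0456 ≈ | 0.0314 | 0.0345 ≈ | 0.0320 | 0.0599 − | 0.0520 | 0.0350 ≈ | 0.0210 |
| 5-5-100 | 0.0540 | 0.0404 | 0.0168 + | 0.0185 | 0.0213 + | 0.0170 | 0.0478 ≈ | 0.0258 | 0.0328 | 0.0245 |
| 5-8-100 | 0.0631 | 0.0155 | 0.0248 + | 0.0184 | 0.0335 + | 0.0187 | 0.0688 ≈ | 0.0247 | 0.0418 + | 0.0229 |
| 3-2-200 | 0.0399 | 0.0215 | 0.0336 ≈ | 0.0320 | 0.0206 ≈ | 0.0249 | 0.0580 − | 0.0376 | 0.0231 ≈ | 0.0158 |
| 3-5-200 | 0.0389 | 0.0225 | 0.0181 + | 0.0172 | 0.0188 + | 0.0217 | 0.0280 ≈ | 0.0160 | 0.0239 ≈ | 0.0196 |
| 3-8-200 | 0.0541 | 0.0297 | 0.0302 + | 0.0246 | 0.0380 ≈ | 0.0237 | 0.0602 ≈ | 0.0343 | 0.0413 ≈ | 0.0243 |
| 4-2-200 | 0.0412 | 0.0344 | 0.0253 + | 0.0282 | 0.0223 + | 0.0273 | 0.0721 − | 0.0466 | 0.0168 + | 0.0230 |
| 4-5-200 | 0.0237 | 0.0242 | 0.0246 ≈ | 0.0201 | 0.0201 ≈ | 0.0284 | 0.0630 − | 0.0577 | 0.0369 ≈ | 0.0276 |
| 4-8-200 | 0.0362 | 0.0190 | 0.0125 + | 0.0112 | 0.0240 ≈ | 0.0175 | 0.0308 ≈ | 0.0326 | 0.0191 + | 0.0087 |
| 5-2-200 | 0.0646 | 0.0446 | 0.0427 + | 0.0587 | 0.0246 + | 0.0358 | 0.0960 − | 0.0756 | 0.0486 + | 0.0319 |
| 5-5-200 | 0.0580 | 0.0335 | 0.0441 ≈ | 0.0198 | 0.0446 ≈ | 0.0299 | 0.1342 − | 0.0389 | 0.0563 ≈ | 0.0387 |
| 5-8-200 | 0.0188 | 0.0233 | 0.0208 ≈ | 0.0229 | 0.0176 ≈ | 0.0083 | 0.0294 ≈ | 0.0294 | 0.0300 ≈ | 0.0132 |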
* Bold values with a gray background highlight the superior performance. The notations “−/+” indicate that the compared algorithm is worse/better than DDQN-MOCE, while “≈” denotes no significant difference between them.
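For readers interpreting the spacing values, the following minimal sketch assumes Schott's widely used definition of the spacing metric (the exact formula used in the paper is not shown in this section). For each solution, it takes the Manhattan distance to its nearest neighbour on the front and returns the sample standard deviation of those distances, so lower values mean a more evenly spread front. The function name and the toy fronts are illustrative only.

```python
import math
from typing import List, Sequence, Tuple

Point = Tuple[float, ...]


def spacing(front: Sequence[Point]) -> float:
    """Schott's spacing: spread of nearest-neighbour Manhattan distances."""
    n = len(front)
    if n < 2:
        return 0.0
    nearest: List[float] = []
    for i, p in enumerate(front):
        # Manhattan distance from p to its closest other point on the front.
        nearest.append(
            min(
                sum(abs(a - b) for a, b in zip(p, q))
                for j, q in enumerate(front)
                if j != i
            )
        )
    mean = sum(nearest) / n
    return math.sqrt(sum((mean - d) ** 2 for d in nearest) / (n - 1))


# Hypothetical fronts, for illustration only.
even_front = [(0.0, 3.0), (1.0, 2.0), (2.0, 1.0), (3.0, 0.0)]
uneven_front = [(0.0, 3.0), (1.0, 2.0), (3.0, 0.0)]
two_point_front = [(0.0, 3.0), (3.0, 0.0)]

print(spacing(even_front))
print(spacing(uneven_front))
print(spacing(two_point_front))
```

Under this definition a front with only two solutions always scores 0, because both nearest-neighbour distances are equal. This is one reason a very small spacing value should be read together with the number of solutions found.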
Table 8. Statistical results of all comparison algorithms on GD.
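| Problem scale | DDQN-MOCE Mean | DDQN-MOCE Std | NSGA-II Mean | NSGA-II Std | MOEA/D Mean | MOEA/D Std | MOPSO Mean | MOPSO Std | SPEA2 Mean | SPEA2 Std |
|---|---|---|---|---|---|---|---|---|---|---|
| 3-2-20 | 0.0573 * | 0.0288 | 0.7482 | 0.0325 | 0.3454 | 0.1432 | 0.8237 | 0.0453 | 0.5562 | 0.1498 |
| 3-5-20 | 0.0910 | 0.0402 | 0.7239 | 0.0492 | 0.5247 | 0.0800 | 0.8019 | 0.0452 | 0.5915 | 0.0968 |
| 3-8-20 | 0.0731 | 0.0304 | 0.4332 | 0.0457 | 0.4211 | 0.1193 | 0.6246 | 0.0493 | 0.4620 | 0.0943 |
| 4-2-20 | 0.0676 | 0.0319 | 0.5155 | 0.0666 | 0.3782 | 0.1300 | 0.7215 | 0.0571 | 0.5492 | 0.1030 |
| 4-5-20 | 0.0979 | 0.0445 | 0.5433 | 0.0724 | 0.4595 | 0.0993 | 0.5637 | 0.0476 | 0.4890 | 0.0687 |
| 4-8-20 | 0.1097 | 0.0611 | 0.4900 | 0.0679 | 0.4958 | 0.1318 | 0.6068 | 0.0574 | 0.4522 | 0.1029 |
| 5-2-20 | 0.1097 | 0.0549 | 0.2623 | 0.0579 | 0.2167 | 0.1092 | 0.4538 | 0.0665 | 0.2741 | 0.1059 |
| 5-5-20 | 0.1019 | 0.0408 | 0.4410 | 0.0619 | 0.0214 | 0.0279 | 0.5100 | 0.0345 | 0.3732 | 0.1137 |
| 5-8-20 | 0.1206 | 0.0515 | 0.3999 | 0.0570 | 0.2877 | 0.0958 | 0.3705 | 0.0486 | 0.3148 | 0.0552 |
| 3-2-50 | 0.0974 | 0.0369 | 0.4613 | 0.0440 | 0.3304 | 0.0787 | 0.2865 | 0.0315 | 0.2533 | 0.0286 |
| 3-5-50 | 0.1308 | 0.0517 | 0.5399 | 0.0768 | 0.3279 | 0.0791 | 0.4580 | 0.0674 | 0.3807 | 0.0944 |
| 3-8-50 | 0.0706 | 0.0337 | 0.7698 | 0.0413 | 0.7617 | 0.0460 | 0.7761 | 0.0354 | 0.7370 | 0.0385 |
| 4-2-50 | 0.0865 | 0.0471 | 0.8253 | 0.0307 | 0.6324 | 0.0976 | 0.9262 | 0.0269 | 0.7396 | 0.0941 |
| 4-5-50 | 0.1710 | 0.0652 | 0.8308 | 0.0420 | 0.7461 | 0.0900 | 0.8435 | 0.0400 | 0.7495 | 0.0518 |
| 4-8-50 | 0.0684 | 0.0214 | 0.6664 | 0.0366 | 0.5982 | 0.0996 | 0.6556 | 0.0244 | 0.5737 | 0.0506 |
| 5-2-50 | 0.1045 | 0.0486 | 0.9357 | 0.0354 | 0.7688 | 0.0818 | 0.9965 | 0.0379 | 0.8594 | 0.1118 |
| 5-5-50 | 0.0738 | 0.0350 | 0.8958 | 0.0332 | 0.8043 | 0.0751 | 0.9394 | 0.0336 | 0.8086 | 0.0661 |
| 5-8-50 | 0.1042 | 0.0306 | 0.4460 | 0.0407 | 0.1856 | 0.0885 | 0.4688 | 0.0326 | 0.2870 | 0.0917 |
| 3-2-100 | 0.1147 | 0.0381 | 0.5419 | 0.0708 | 0.3892 | 0.0727 | 0.3935 | 0.0422 | 0.3305 | 0.0743 |
| 3-5-100 | 0.0872 | 0.0464 | 0.9068 | 0.0275 | 0.8636 | 0.0413 | 0.9522 | 0.0250 | 0.8905 | 0.0405 |
| 3-8-100 | 0.0477 | 0.0091 | 0.2147 | 0.0259 | 0.0929 | 0.0308 | 0.1805 | 0.0223 | 0.1248 | 0.0499 |
| 4-2-100 | 0.0828 | 0.0358 | 0.3546 | 0.0634 | 0.2275 | 0.0699 | 0.3434 | 0.0432 | 0.1765 | 0.0492 |
| 4-5-100 | 0.1274 | 0.0404 | 0.4311 | 0.0636 | 0.2554 | 0.1014 | 0.3957 | 0.0401 | 0.2349 | 0.0814 |
| 4-8-100 | 0.1441 | 0.1000 | 1.1610 | 0.0242 | 1.1698 | 0.0151 | 1.0774 | 0.0610 | 1.0891 | 0.0423 |
| 5-2-100 | 0.1043 | 0.0242 | 0.3176 | 0.0463 | 0.2717 | 0.0556 | 0.2138 | 0.0448 | 0.1801 | 0.0388 |
| 5-5-100 | 0.1331 | 0.0691 | 0.9802 | 0.0340 | 0.9127 | 0.0476 | 1.0129 | 0.0264 | 0.9125 | 0.0477 |
| 5-8-100 | 0.0647 | 0.0199 | 0.2679 | 0.0825 | 0.1362 | 0.0336 | 0.2455 | 0.0639 | 0.1784 | 0.0317 |
| 3-2-200 | 0.0851 | 0.0292 | 0.3507 | 0.0562 | 0.1913 | 0.0799 | 0.2961 | 0.0395 | 0.2046 | 0.0644 |
| 3-5-200 | 0.0596 | 0.0133 | 0.1852 | 0.0327 | 0.1236 | 0.0396 | 0.1445 | 0.0217 | 0.1035 | 0.0298 |
| 3-8-200 | 0.0561 | 0.0258 | 0.8426 | 0.1236 | 0.7164 | 0.2439 | 0.8373 | 0.1670 | 0.6577 | 0.2110 |
| 4-2-200 | 0.1452 | 0.0642 | 0.9511 | 0.0200 | 0.9731 | 0.0173 | 0.9731 | 0.0173 | 0.8880 | 0.0491 |
| 4-5-200 | 0.1210 | 0.0575 | 1.0327 | 0.0286 | 0.9835 | 0.0390 | 1.0274 | 0.0210 | 0.9761 | 0.0419 |
| 4-8-200 | 0.0745 | 0.0170 | 0.1716 | 0.0406 | 0.0850 | 0.0388 | 0.1371 | 0.0210 | 0.0849 | 0.0332 |
| 5-2-200 | 0.2494 | 0.0976 | 0.6416 | 0.0967 | 0.4730 | 0.1141 | 0.4608 | 0.0558 | 0.4375 | 0.0775 |
| 5-5-200 | 0.0510 | 0.0266 | 0.7880 | 0.1288 | 0.7267 | 0.1370 | 0.7777 | 0.1568 | 0.6780 | 0.2039 |
| 5-8-200 | 0.0127 | 0.0381 | 1.1701 | 0.0677 | 1.0534 | 0.0978 | 1.1887 | 0.0779 | 1.0779 | 0.1117 |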
* Bold values with a gray background highlight the superior performance.
Table 9. Statistical results of all comparison algorithms on C-metric.
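| Problem scale | DDQN-MOCE vs. NSGA-II: A | DDQN-MOCE vs. NSGA-II: A′ | DDQN-MOCE vs. MOEA/D: B | DDQN-MOCE vs. MOEA/D: B′ | DDQN-MOCE vs. MOPSO: C | DDQN-MOCE vs. MOPSO: C′ | DDQN-MOCE vs. SPEA2: D | DDQN-MOCE vs. SPEA2: D′ |
|---|---|---|---|---|---|---|---|---|
| 3-2-20 | 1 * | 0 | 1 | 0 | 1 | 0 | 1 | 0 |
| 3-5-20 | 1 | 0 | 0.9300 | 0 | 1 | 0 | 0.9857 | 0 |
| 3-8-20 | 1 | 0 | 0.9333 | 0.0063 | 1 | 0 | 0.9583 | 0 |
| 4-2-20 | 1 | 0 | 0.9833 | 0 | 1 | 0 | 1 | 0 |
| 4-5-20 | 1 | 0 | 0.9625 | 0 | 1 | 0 | 0.9833 | 0 |
| 4-8-20 | 1 | 0 | 0.7392 | 0.01 | 0.9500 | 0 | 0.7296 | 0.0183 |
| 5-2-20 | 0.95 | 0.0083 | 0.7786 | 0.0424 | 1 | 0 | 0.7500 | 0.0563 |
| 5-5-20 | 1 | 0 | 0.7300 | 0 | 1 | 0 | 0.8959 | 0 |
| 5-8-20 | 0.9292 | 0.0167 | 0.5975 | 0.0133 | 0.8083 | 0 | 0.7761 | 0.0260 |
| 3-2-50 | 1 | 0 | 0.8333 | 0 | 0.6167 | 0.0353 | 0.6359 | 0.0324 |
| 3-5-50 | 1 | 0 | 0.7508 | 0 | 0.9333 | 0 | 0.8758 | 0 |
| 3-8-50 | 1 | 0 | 1 | 0 | 1 | 0 | 0.9917 | 0 |
| 4-2-50 | 1 | 0 | 1 | 0 | 1 | 0 | 1 | 0 |
| 4-5-50 | 1 | 0 | 0.95 | 0 | 1 | 0 | 0.9644 | 0 |
| 4-8-50 | 1 | 0 | 0.95 | 0 | 0.9333 | 0 | 0.8973 | 0 |
| 5-2-50 | 1 | 0 | 1 | 0 | 1 | 0 | 1 | 0 |
| 5-5-50 | 1 | 0 | 1 | 0 | 1 | 0 | 1 | 0 |
| 5-8-50 | 1 | 0 | 0.3831 | 0.1375 | 1 | 0 | 0.6619 | 0 |
| 3-2-100 | 1 | 0 | 0.8708 | 0 | 0.8250 | 0.0100 | 0.7315 | 0.0100 |
| 3-5-100 | 1 | 0 | 1 | 0 | 1 | 0 | 1 | 0 |
| 3-8-100 | 0.9000 | 0 | 0.2467 | 0 | 0.7333 | 0 | 0.7633 | 0 |
| 4-2-100 | 0.9875 | 0 | 0.6833 | 0.0115 | 0.9000 | 0.0076 | 0.4938 | 0.016 |
| 4-5-100 | 0.9833 | 0 | 0.7042 | 0.0201 | 0.8667 | 0 | 0.6791 | 0.011 |
| 4-8-100 | 1 | 0 | 1 | 0 | 1 | 0 | 1 | 0 |
| 5-2-100 | 1 | 0 | 1 | 0 | 0.6833 | 0 | 0.7240 | 0.0045 |
| 5-5-100 | 1 | 0 | 1 | 0 | 1 | 0 | 1 | 0 |
| 5-8-100 | 0.7850 | 0 | 0.4883 | 0.005 | 0.5045 | 0.0113 | 0.5293 | 0.0125 |
| 3-2-200 | 1 | 0 | 0.7167 | 0 | 0.8833 | 0 | 0.7028 | 0 |
| 3-5-200 | 0.9750 | 0 | 0.7958 | 0.0035 | 0.6667 | 0.0063 | 0.5256 | 0.0158 |
| 3-8-200 | 1 | 0 | 0.9107 | 0 | 1 | 0 | 0.9702 | 0 |
| 4-2-200 | 1 | 0 | 1 | 0 | 1 | 0 | 0.9417 | 0 |
| 4-5-200 | 1 | 0 | 1 | 0 | 1 | 0 | 1 | 0 |
| 4-8-200 | 0.9000 | 0.0016 | 0.5221 | 0.0091 | 0.6750 | 0.0076 | 0.5275 | 0.0112 |
| 5-2-200 | 1 | 0 | 0.9625 | 0 | 0.9333 | 0 | 0.9599 | 0 |
| 5-5-200 | 0.8750 | 0 | 0.8047 | 0 | 0.8750 | 0 | 0.8179 | 0 |
| 5-8-200 | 1 | 0 | 1 | 0 | 1 | 0 | 1 | 0 |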
* Bold values with a gray background highlight the superior performance.
Table 10. Statistical results of all comparison algorithms of 5-5-20.
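| Algorithm | HV | Spacing | GD | Solution count |
|---|---|---|---|---|
| DDQN-MOCE | 0.9140 * | 0.0469 | 0.1044 | 10 |
| NSGA-II | 0.4143 | 0.0010 | 0.4426 | 2 |
| MOEA/D | 0.4389 | 0.0080 | 0.3042 | 4 |
| MOPSO | 0.3398 | 0.0313 | 0.5334 | 3 |
| SPEA2 | 0.3382 | 0.0580 | 0.5389 | 5 |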
* Bold values highlight the superior performance.
Table 11. Statistical results of all comparison algorithms of 4-8-50.
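| Algorithm | HV | Spacing | GD | Solution count |
|---|---|---|---|---|
| DDQN-MOCE | 0.9778 * | 0.1447 | 0.1046 | 12 |
| NSGA-II | 0.1981 | 0.0026 | 0.7145 | 4 |
| MOEA/D | 0.2848 | 0.1494 | 0.5684 | 3 |
| MOPSO | 0.2474 | 0.0102 | 0.6577 | 3 |
| SPEA2 | 0.3995 | 0.0322 | 0.5325 | 11 |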
* Bold values highlight the superior performance.
Table 12. Statistical results of all comparison algorithms of 5-8-100.
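| Algorithm | HV | Spacing | GD | Solution count |
|---|---|---|---|---|
| DDQN-MOCE | 0.9044 * | 0.1151 | 0.0494 | 13 |
| NSGA-II | 0.1509 | 0.0185 | 0.3821 | 3 |
| MOEA/D | 0.1680 | 0.0963 | 0.3523 | 7 |
| MOPSO | 0.1951 | 0.0315 | 0.2528 | 3 |
| SPEA2 | 0.2873 | 0.0216 | 0.2045 | 10 |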
* Bold values highlight the superior performance.
Table 13. Statistical summary of performance ranks and Friedman test results for comparison algorithms (DDQN-MOCE, NSGA-II, MOEA/D, MOPSO, and SPEA2; significance level = 0.05).
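| Algorithms | HV Rank | HV χ² | HV p-Value | Spacing Rank | Spacing χ² | Spacing p-Value | GD Rank | GD χ² | GD p-Value |
|---|---|---|---|---|---|---|---|---|---|
| DDQN-MOCE | 1.00 | 124.42 | $6.07 \times 10^{-26}$ | 3.69 | 88.89 | $2.27 \times 10^{-18}$ | 1.03 | 111.51 | $3.46 \times 10^{-23}$ |
| NSGA-II | 4.75 | | | 1.92 | | | 4.33 | | |
| MOEA/D | 2.86 | | | 1.81 | | | 2.65 | | |
| MOPSO | 4.06 | | | 4.75 | | | 4.35 | | |
| SPEA2 | 2.33 | | | 2.83 | | | 2.64 | | |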
Table 14. Statistical results of a real-world case for HV, spacing, GD, and time.
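| Metric | Algorithm | Mean | Std | Best | Worst |
|---|---|---|---|---|---|
| HV | DDQN-MOCE | 0.7622 * | 0.0337 | 0.8427 | 0.7083 |
| HV | NSGA-II | 0.3111 | 0.0299 | 0.3577 | 0.227 |
| HV | MOEA/D | 0.3247 | 0.0787 | 0.4789 | 0.2027 |
| HV | MOPSO | 0.3252 | 0.0263 | 0.3796 | 0.2842 |
| HV | SPEA2 | 0.4013 | 0.0745 | 0.5276 | 0.2627 |
| Spacing | DDQN-MOCE | 0.0597 | 0.046 | 0.0161 | 0.1691 |
| Spacing | NSGA-II | 0.0224 | 0.0276 | 0 | 0.1176 |
| Spacing | MOEA/D | 0.0367 | 0.0332 | 0 | 0.1154 |
| Spacing | MOPSO | 0.0329 | 0.0231 | 0.0016 | 0.0915 |
| Spacing | SPEA2 | 0.0337 | 0.0197 | 0.0014 | 0.0737 |
| GD | DDQN-MOCE | 0.0851 | 0.0339 | 0.0179 | 0.1423 |
| GD | NSGA-II | 0.2572 | 0.0613 | 0.1196 | 0.404 |
| GD | MOEA/D | 0.1772 | 0.084 | 0.0153 | 0.3861 |
| GD | MOPSO | 0.2594 | 0.0436 | 0.1779 | 0.3394 |
| GD | SPEA2 | 0.1674 | 0.0633 | 0.0529 | 0.3165 |
| Time (seconds) | DDQN-MOCE | 500.91 | 53.34 | 380.54 | 633.27 |
| Time (seconds) | NSGA-II | 2441.01 | 174.24 | 2076.11 | 2558.04 |
| Time (seconds) | MOEA/D | 882.96 | 62.00 | 756.09 | 927.11 |
| Time (seconds) | MOPSO | 636.36 | 39.80 | 542.45 | 662.86 |
| Time (seconds) | SPEA2 | 305.66 | 22.20 | 261.38 | 341.79 |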
* Bold values with a gray background highlight the superior performance.
Table 15. Statistical results of a real-world case for the C-metric.
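| C-metric | DDQN-MOCE | NSGA-II | MOEA/D | MOPSO | SPEA2 |
|---|---|---|---|---|---|
| DDQN-MOCE | - | 1 * | 0.5 | 1 | 1 |
| NSGA-II | 0 | - | 0 | 0.6667 | 0 |
| MOEA/D | 0 | 0 | - | 1 | 0.75 |
| MOPSO | 0 | 0 | 1 | - | 0.75 |
| SPEA2 | 0 | 0 | 0 | 0 | - |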
* Bold values with a gray background highlight the superior performance.
Disclaimer/Publisher’s Note: The statements, opinions and data contained in all publications are solely those of the individual author(s) and contributor(s) and not of MDPI and/or the editor(s). MDPI and/or the editor(s) disclaim responsibility for any injury to people or property resulting from any ideas, methods, instructions or products referred to in the content.

Zhang, Q.; Zhang, Q.; Duan, J.; Qin, J.; Zhou, Y. Energy-Efficient Scheduling for Distributed Hybrid Flowshop of Offshore Wind Blade Manufacturing Considering Limited Buffers. J. Mar. Sci. Eng. 2025, 13, 2176. https://doi.org/10.3390/jmse13112176
