A Knowledge-Based Cooperative Differential Evolution Algorithm for Energy-Efficient Distributed Hybrid Flow-Shop Rescheduling Problem

Abstract: Due to the increasing level of customization and the globalization of competition, rescheduling for distributed manufacturing is receiving more attention. Meanwhile, environmentally friendly production is becoming a force to be reckoned with in intelligent manufacturing industries. In this paper, the energy-efficient distributed hybrid flow-shop rescheduling problem (EDHFRP) is addressed and a knowledge-based cooperative differential evolution (KCDE) algorithm is proposed to simultaneously minimize the makespan of both the original and the newly arrived orders and the total energy consumption. First, two heuristics are designed and used cooperatively for initialization. Next, a three-dimensional knowledge base is employed to record the information carried by elite individuals. A novel DE with three different mutation strategies is proposed to generate the offspring, and a local intensification strategy is used to further enhance the exploitation ability. The effects of the major parameters are investigated and extensive experiments are carried out. The numerical results prove the effectiveness of each specially designed strategy, while comparisons with four existing algorithms demonstrate the efficiency of KCDE in solving EDHFRP.


Introduction
Productivity has become one of the core competencies in the manufacturing industry, particularly with the integration of the world economy. However, with the ongoing trend of mass customization and the accelerating variety of products, a growing number of manufacturing companies have realized the importance of environmentally friendly manufacturing. As the largest energy consumer and CO2 emitter, China has an energy structure that depends heavily on coal. More than half of its electricity production is fueled by coal, while over 70% of its power plants are coal-fired [1]. Energy-efficient scheduling in manufacturing industries is an effective way to reduce carbon emissions [2]. Therefore, it is of extraordinary significance to develop production schedules for intelligent manufacturing industries with energy-related constraints and objectives.
By extending the production model into the distributed environment and establishing multiple factories in various remote geographical locations, distributed manufacturing has been widely applied in various fields, including automotive, steel-making, and chemical processing [3]. In the past few decades, various evolutionary algorithms, such as the artificial bee colony [4], iterated greedy algorithm [5], Jaya algorithm [6], and memetic algorithm [7], have been proposed to solve distributed shop scheduling problems with different constraints. The authors of [8] focused on the distributed parallel machine-scheduling problem and proposed a knowledge-based two-population optimization algorithm to minimize the energy consumption and total tardiness simultaneously. A distributed flow-shop group scheduling problem was investigated and solved by a cooperative co-evolutionary algorithm with a novel collaboration model and a reinitialization scheme in [9]. The authors of [10] studied a distributed heterogeneous hybrid flow-shop scheduling problem (FSP) under nonidentical time-of-use electricity tariffs, in which the makespan and total electricity charges were considered as optimization objectives. A distributed hybrid differentiation FSP was investigated in [11], for which a distributed co-evolutionary memetic algorithm was proposed to minimize the makespan. Recently, the combination of the distributed flow-shop and parallel machines, namely the distributed hybrid flow-shop scheduling problem (DHFSP), has emerged as a practical production model. In order to provide an efficient production schedule for DHFSP, ref. [7] proposed a bi-population cooperative memetic algorithm. For energy-aware DHFSP with multiprocessor tasks, [12] proposed a multi-objective evolutionary algorithm based on decomposition, while [13] designed a knowledge-based multi-objective algorithm.
Uncertainties always exist in real-life manufacturing. Unlike deterministic processing, rescheduling aims to make schedules under dynamic environments, dealing with unpredicted events such as new job arrivals and machine breakdowns. Rescheduling in FSP with variable processing times by using real-time information was studied in [14]. Rescheduling jobs in the flexible job-shop scheduling problem (FJSP) is receiving more attention. FJSP with newly arrived priority jobs was investigated in [15], while machine breakdowns and FJSP recovery were considered in [16,17]. The authors of [18,19] took both real-time order acceptance and machine maintenance into consideration when rescheduling the FJSP. Furthermore, [20] provided energy-efficient schedules for FJSP with dynamic variations occurring frequently during production. The authors of [21] focused on rescheduling for the hybrid flow-shop scheduling problem (HFSP), in which energy consumption was regarded as an objective and real-time order acceptance was considered. Moreover, [22] focused on steelmaking systems and proposed a hybrid fruit fly optimization algorithm to solve the rescheduling problem in HFSP with machine breakdowns and processing variations, while release variation was considered in [23]. Rescheduling for a single-machine problem was studied in [24], in which the total waiting time was set as an objective and newly arrived reworked jobs were considered. Rescheduling in several identical parallel-machine problems was investigated in [25], where sequence-dependent setup times, release times, and newly arrived jobs were considered.
Inspired by the natural evolution of species, differential evolution (DE) is a simple yet efficient heuristic for global optimization over continuous spaces [26]; it has been improved over the past few decades and widely applied as a state-of-the-art global optimization technique in a variety of scheduling problems [27]. In [28], DE was combined with a local search mechanism to minimize the makespan and tardiness of FSP simultaneously. In order to solve the permutation FSP while minimizing the makespan, optimization algorithms based on discrete DE were designed in [29], and a DE with an algebraic differential mutation was designed in [30], considering the total flowtime criterion. Time-of-use electricity tariffs were considered in [31], where a multi-objective discrete DE with novel mutation and crossover operations was developed. The authors of [32] combined DE with heuristic strategies in order to solve FJSP with outsourcing operations and job priority constraints, while a new search-based mutation strategy was designed for DE in [33] to minimize the makespan in FJSP with reconfigurable machine tools. Moreover, DE was used for scheduling hybrid FSP [34] and JSP with fuzzy processing times [35,36]. In [8], DE worked cooperatively with NSGA-II on the population in parallel and provided efficient solutions for the distributed energy-efficient parallel machine-scheduling problem. DE has also been used for solving other scheduling problems, such as robotic cell scheduling with batch-processing machines [37] and a single batch-processing machine with arbitrary job sizes and release times [38].
This study aims to solve the energy-efficient distributed hybrid flow-shop rescheduling problem (EDHFRP) by minimizing the makespan of both the first order and the newly arrived order, along with the total energy consumption. EDHFRP can be considered an extension of the existing DHFSP studied in [7,12], adding energy-related objectives and newly arrived jobs. The mathematical model of EDHFRP is provided in Section 2.1, appending the calculation of the energy consumption and the description of rescheduling to the model of DHFSP defined in [7,12]. In order to solve this problem, a knowledge-based cooperative differential evolution (KCDE) algorithm is proposed. In KCDE, two heuristics are designed for initialization according to the characteristics of EDHFRP. Then, a three-dimensional knowledge base is proposed to describe the information carried by elite individuals. Moreover, three novel differential mutation strategies along with a crossover method are applied to the knowledge base for evolution. Additionally, a local intensification strategy is designed for elite individuals in order to further improve the quality of the solutions. Finally, extensive comparisons and tests are carried out to verify the efficiency and effectiveness of the proposed KCDE in solving EDHFRP.
The remainder of this paper is divided into four parts. Section 2 describes EDHFRP in detail and provides the mathematical model. Every strategy and design of KCDE is introduced in Section 3. The numerical results and discussion are shown in Section 4. Finally, the conclusion provides a brief summary and critique of the findings in Section 5.

Problem Description
The indices, parameters, variables, and decision variables used in the following description are listed as follows.

Indices
O 1 : The first order.
O 2 : The newly arrived order during the processing of O 1 .
Index of jobs under processing at time t a .

Decision Variables Description
x j,f : Binary variable whose value equals 1 when job j is assigned to factory f, or 0 otherwise.
y j,f,k,i : Binary variable whose value equals 1 when job j is assigned to machine i at stage k in factory f, or 0 otherwise.
z j,j′,f,k,i : Binary variable whose value equals 1 when job j is processed before job j′ on machine i at stage k in factory f, or 0 otherwise.
In a typical DHFSP, a total of n jobs need to be processed on a set of machines. Each job includes s stages and there are one or more machines for each stage. Moreover, the machines belong to F geographically dispersed factories. Each job needs to be assigned to one of the factories before processing, and cannot be removed from that factory until completion. Therefore, in order to schedule DHFSP, three sub-problems need to be solved, namely factory assignment, machine selection, and job sequencing.
In EDHFRP, during the processing of the n 1 jobs in the first order O 1 , another order O 2 with n 2 jobs arrives at time t a and needs to be completed as soon as possible. At time t a , the possible states of the jobs in O 1 are completed, under processing, and unprocessed. For jobs in the first two states, the original schedule is kept, while the unprocessed jobs in O 1 need rescheduling along with the new jobs in O 2 . However, during the rescheduling, the available time of each machine changes from 0 to the completion time of the jobs under processing at time t a . Furthermore, the unprocessed jobs in O 1 have to remain in the factories to which they were assigned, while the newly arrived jobs in O 2 can be assigned to any factory. Therefore, the complexity of scheduling EDHFRP is much higher than that of the typical DHFSP.
The Gantt chart of a solution of an instance with n 1 = 8, n 2 = 5, s = 3, F = 2, and t a = 30 is illustrated in Figure 1. The original schedule for O 1 is shown in Figure 1a; it can be seen that job 1 and job 7 in factory 1, as well as job 8 in factory 2, are unprocessed when the new order O 2 arrives at t a . Thus, the three unprocessed jobs along with the jobs in O 2 are rescheduled, while job 1 and job 7 still remain in factory 1 and job 8 in factory 2. The processing plan after rescheduling is shown in Figure 1b, in which the rectangles labeled II-1∼II-5 denote the jobs in O 2 . The makespans of O 1 and O 2 are 163 and 195 min, respectively, and the total energy consumption can be calculated as 134.08 kJ.

Mathematical Model
The mixed integer linear programming model for EDHFRP, minimizing MS 1 , MS 2 , and TEC, is formulated as follows, subject to:

S j1,1 ≥ 0 ∀ j1 (4)

x j,f ∈ {0, 1} ∀ j, f

where Equation (1) describes the three optimization objectives; Equation (2) assures that exactly one factory is selected for each job; Equation (3) ensures that each stage of each job is processed on exactly one machine; Equation (4) guarantees that all jobs in O 1 are processed after time 0; Equation (5) makes sure that each job in O 1 is either under processing or unprocessed at time t a ; Equation (6) defines the calculation of the completion time; Equation (7) ensures that each stage can only be processed after the completion of the previous one; Equations (8)-(10) guarantee that each machine processes at most one job at any time; Equations (11) and (12) define the makespans of the two orders; Equations (13)-(15) calculate the total energy consumption; and Equations (16)-(18) indicate the binary decision variables.

KCDE for EDHFRP
The classical DE is an efficient and powerful population-based stochastic search technique for solving optimization problems over a continuous space, aiming to evolve a population of D-dimensional parameter vectors, which encode the candidate solutions, toward the global optimum. However, for a discrete optimization problem, the parameter vectors built for job-shop scheduling problems are limited to positive integers by the constraints; thus, it is difficult for DE to reach its full potential when used directly for job-shop scheduling problems. The knowledge base proposed in this paper is capable of making an explicit connection between the continuous space and the discrete parameter vectors. In this section, the representation of the solution is first introduced briefly, followed by the details of the four vital components of KCDE: hybrid initialization, the knowledge base, cooperative differential mutation, and local intensification.

Solution Representation
For each job in EDHFRP, the factory assignment should be determined first, then a machine is selected for each stage, while the processing sequence on each machine should be determined at the same time. In this paper, the efficient and widely used permutation-encoding method is employed to represent the job sequence in each factory. Machine assignments are obtained through decoding operations according to the job sequence and factory assignment. Therefore, a solution of EDHFRP is encoded as a set of job sequences, one per factory, where the sequence of factory f represents the processing order of the jobs at the first stage in that factory. Considering the feasibility of the solutions, each job should occur in the job sequence of exactly one factory, once and only once.
To be specific, the corresponding solution of the Gantt chart in Figure 1a, before the new order arrival, is shown in Figure 2a, from which it can be seen that jobs 3, 5, 7, and 1 are processed in factory 1 sequentially, while the first job processed in factory 2 is job 4, followed by jobs 2, 6, and 8. When O 2 arrives at time t a , jobs 3, 5, 4, and 2 are being processed, while jobs 7, 1, 6, and 8 remain unprocessed. Therefore, the 4 unprocessed jobs along with the 5 new jobs are rescheduled as in Figure 2b, in which jobs 7, 1, 6, and 8 remain in the factories they were assigned to at first.

Hybrid Initialization
In order to strike a balance between the quality and variety of the initial population, three different initialization strategies, including two EDHFRP characteristic-based heuristics and a random method, were designed and employed in a cooperative manner.
It has been conclusively shown that the Nawaz-Enscore-Ham (NEH) heuristic [39] is one of the most effective heuristics for solving FSP [40]. Recently, variants of the NEH heuristic have been widely used in DFSP and DHFSP [5,6,41,42]. In this paper, a greedy NEH heuristic (GNEH) and an NEH heuristic based on lower bounds (NEHLB) are proposed.
For both heuristics, the initial job sequences are generated randomly and the jobs are arranged sequentially. The main procedure of GNEH is described in detail as Algorithm 1. First, each job is inserted into all possible positions in each factory. Next, the values of MS 1 and TEC of each insertion are calculated. Then, the insertion with the best result is selected. NP/3 individuals are generated through GNEH. NEHLB is designed as Algorithm 2 and implemented to generate another NP/3 individuals. For each job, the current lower bounds LB MS1 and LB TEC of each factory are calculated first. Next, the factory with the smallest value is chosen. Then, the job is inserted randomly into one of the possible positions in that factory. In order to increase the diversity of the initial population, the rest of the population is generated randomly.
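As a rough illustration, the greedy insertion loop of GNEH can be sketched as follows. Here `evaluate` is a hypothetical callback returning an (MS 1, TEC) tuple for a candidate assignment, and comparing these tuples lexicographically is a simplification of ours, since the text does not specify how the two objectives are aggregated when picking the best insertion:

```python
import random

def gneh_insert(jobs, num_factories, evaluate):
    """GNEH sketch: shuffle the jobs, then greedily insert each job at the
    factory/position whose (MS1, TEC) evaluation is best.
    `evaluate` is a hypothetical problem-specific scoring callback."""
    random.shuffle(jobs)                       # random initial job sequence
    seqs = [[] for _ in range(num_factories)]  # one job sequence per factory
    for job in jobs:
        best, best_score = None, None
        for f in range(num_factories):
            for pos in range(len(seqs[f]) + 1):
                seqs[f].insert(pos, job)       # tentative insertion
                score = evaluate(seqs)         # -> (MS1, TEC) tuple
                seqs[f].pop(pos)               # undo the insertion
                if best_score is None or score < best_score:
                    best, best_score = (f, pos), score
        f, pos = best
        seqs[f].insert(pos, job)               # commit the best insertion
    return seqs
```

NEHLB would follow the same skeleton but first pick the factory by the lower bounds and then insert at a random position.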

3D Knowledge Base
Taking the features of DE and EDHFRP into consideration, a three-dimensional knowledge base is designed to store the information of elite individuals. In the proposed knowledge base, both the x-axis and y-axis represent the jobs, while the z-axis represents the factories. Therefore, the xy-plane is a square whose side length equals the number of jobs n. The value of the knowledge base at position (x 0 , y 0 , z 0 ), where x 0 , y 0 ∈ {1, 2, · · · , n}, x 0 ≠ y 0 , and z 0 ∈ {1, 2, · · · , F}, denotes the probability that both jobs x 0 and y 0 are assigned to the same factory z 0 with job x 0 processed before job y 0 , while the value at position (x 0 , x 0 , z 0 ) denotes the probability that job x 0 is assigned to factory z 0 .
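The three-dimensional structure can be pictured with a small NumPy sketch; the toy sizes and the uniform initialization are our assumptions, since Equation (19) is not reproduced in the text:

```python
import numpy as np

n, F = 5, 2  # toy sizes: 5 jobs, 2 factories

# KB[x, y, z] (x != y): probability that jobs x and y are both assigned to
# factory z with x processed before y.
# KB[x, x, z]: probability that job x is assigned to factory z.
KB = np.full((n, n, F), 1.0 / F)  # uniform start (one plausible reading of Eq. (19))
```

Each diagonal fiber KB[x, x, :] is then a probability distribution over the F factories for job x.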
The knowledge base is initialized as Equation (19). At the end of each generation, the knowledge base is updated according to the non-dominated solutions. For each elite individual pop e in E(g), e ∈ {1, 2, · · · , n E }, an incremental matrix δ e is calculated according to Equations (20) and (21), where n E is the number of non-dominated solutions.
Finally, the knowledge base is updated via Equation (22) and normalized as Equation (23).
Through the operations above, the useful information of the better-performing solutions is transformed from the discrete model into the continuous space, which is better suited to DE. Moreover, the processing priority between different jobs and the information of the factory assignment are linked together.
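Since Equations (20)-(23) are not reproduced in the text, the following sketch shows one plausible form of the update: precedence and assignment counts are accumulated from the elite individuals, blended into the knowledge base with a weight α, and the factory-assignment probabilities renormalized. All names and the exact blending form are our assumptions:

```python
import numpy as np

def update_kb(KB, elites, alpha):
    """Assumed form of the KB update: build an incremental matrix from the
    non-dominated solutions (Eqs. (20)-(21)), blend it into the KB with
    weight alpha (Eq. (22)), then renormalize (Eq. (23))."""
    n, _, F = KB.shape
    delta = np.zeros_like(KB)
    for seqs in elites:                        # one job-sequence list per factory
        for f, seq in enumerate(seqs):
            for a, x in enumerate(seq):
                delta[x, x, f] += 1.0          # job x assigned to factory f
                for y in seq[a + 1:]:
                    delta[x, y, f] += 1.0      # x precedes y in factory f
    delta /= max(len(elites), 1)
    KB = alpha * KB + (1.0 - alpha) * delta    # blend old knowledge with new
    # renormalize the diagonal so each job's factory probabilities sum to 1
    diag_sum = KB[np.arange(n), np.arange(n), :].sum(axis=1)
    for j in range(n):
        KB[j, j, :] /= diag_sum[j]
    return KB
```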

Cooperative DE
As the most important step of DE, the mutation operator provides new individuals for the population, and considerable research has proposed modifications to mutation strategies over the past two decades [43]. The six most popular types of DE mutation operators and their characteristics are presented in [44]. Because of the performance insufficiency of each single mutation strategy, the cooperation among several different strategies has been used to solve complicated scheduling problems [33,45,46]. The mutation operator DE/rand/1 has the strongest exploration properties but very limited exploitation ability; DE/best/1 has the strongest exploitation ability but may end up in premature convergence; DE/current-to-best/1 has both exploration and exploitation properties owing to its utilization of the memory of the best individual. Their cooperation may thus avoid being trapped in local optima and ensure the balance between diversification and intensification when searching for the optimal solution. In this paper, three differential mutation strategies, i.e., DE/rand/1, DE/best/1, and DE/current-to-best/1, are employed cooperatively. For each individual in Pop 1 (g), the DE/rand/1 strategy is carried out on KB g , for which three individuals are selected from Pop 1 (g) randomly. The best of them according to the non-dominated sort is recorded as x r0 , the worst individual is denoted x r2 , and the remaining one is denoted x r1 . Then, a variant matrix can be obtained through differential mutation, as described in Equations (24) and (25), where x, y ∈ {1, 2, . . . , n}, x ≠ y, z ∈ {1, 2, · · · , F}, and F m denotes the mutation factor.
The DE/best/1 strategy is carried out on KB g for each individual in Pop 2 (g). A total of three individuals, including the best one in Pop 2 (g) and two others selected randomly, are chosen and denoted as x best , x r1 , and x r2 . Then, the differential mutation is carried out as in Equations (24) and (25), in which x r0 is replaced by x best ; thus, the variant matrix ∆ 2 is obtained.
As for DE/current-to-best/1, the current individual along with the best one in Pop 3 (g) and two randomly selected ones are chosen and recorded as x i , x best , x r1 , and x r2 , respectively. Then, the variant matrix is generated through Equations (26) and (27).
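The three strategies share the classic DE difference form; a sketch on knowledge-base-shaped arrays follows. The clipping to [0, 1] is our assumption, added so that the entries remain valid probabilities; Equations (24)-(27) may differ in detail:

```python
import numpy as np

def de_rand_1(x_r0, x_r1, x_r2, Fm):
    """DE/rand/1 (assumed form of Eqs. (24)-(25)): base matrix plus a
    scaled difference of two others, clipped to keep entries in [0, 1]."""
    return np.clip(x_r0 + Fm * (x_r1 - x_r2), 0.0, 1.0)

def de_best_1(x_best, x_r1, x_r2, Fm):
    """DE/best/1: same form with the best individual as the base."""
    return np.clip(x_best + Fm * (x_r1 - x_r2), 0.0, 1.0)

def de_current_to_best_1(x_i, x_best, x_r1, x_r2, Fm):
    """DE/current-to-best/1 (assumed form of Eqs. (26)-(27)): the current
    individual is pulled toward the best, plus a random difference."""
    return np.clip(x_i + Fm * (x_best - x_i) + Fm * (x_r1 - x_r2), 0.0, 1.0)
```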

Crossover
The crossover operator constructs a new trial individual under the guidance of a target individual and a mutant individual [43]. Most of the existing crossover operators of DE are binomial variants, which are essentially suited to continuous optimization problems. However, as discrete problems, job-shop scheduling problems impose various limits on the structure of individuals. Most existing research using DE to solve scheduling problems has paid attention to modifying either the crossover strategies or the structure of individuals, instead of altering both of them [47], which may cause a mismatch between the problem and the algorithm. Thus, on the basis of the knowledge base, a problem-specific crossover operator is proposed as follows.
With ∆ d and KB g , an offspring matrix can be obtained through crossover, as shown in Equation (28),
where CR is the crossover probability, ∆ d , d ∈ {1, 2, 3}, indicates the three variant matrices provided by the three DE strategies, and KB gd is the corresponding offspring matrix. Finally, an offspring individual is sampled from KB gd according to the roulette wheel selection strategy. The sampling operation mainly consists of two steps: choosing the factory for every job and sorting the jobs in each factory. In the first step, the factory f with the maximum KB gd (j, j, f) is selected for each job j. Then, the jobs assigned to the same factory are chosen one by one via the roulette wheel selection strategy according to the values of KB gd (j, j′, f); thus, jobs with higher values are more likely to be processed earlier.
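The crossover and the two-step sampling can be sketched as below. The entry-wise binomial mixing and the exact roulette weights (summing each job's precedence values over the other remaining jobs) are our assumed readings of Equation (28) and the sampling description:

```python
import numpy as np

def crossover(delta, KB, CR, rng):
    """Binomial-style crossover (assumed form of Eq. (28)): each entry of
    the offspring matrix comes from the variant matrix with probability CR,
    otherwise from the knowledge base."""
    mask = rng.random(KB.shape) < CR
    return np.where(mask, delta, KB)

def sample_individual(KBgd, rng):
    """Sample an offspring: pick the most probable factory per job, then
    order each factory's jobs by roulette-wheel selection on KBgd(j, j', f)."""
    n, _, F = KBgd.shape
    factory = [int(np.argmax(KBgd[j, j, :])) for j in range(n)]
    seqs = []
    for f in range(F):
        remaining = [j for j in range(n) if factory[j] == f]
        seq = []
        while remaining:
            # weight job j by its precedence values over the other remaining jobs
            w = np.array([sum(KBgd[j, k, f] for k in remaining if k != j) + 1e-9
                          for j in remaining])
            j = remaining[rng.choice(len(remaining), p=w / w.sum())]
            seq.append(j)
            remaining.remove(j)
        seqs.append(seq)
    return seqs
```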

Selection
The offspring generated through DE/rand/1, DE/best/1, and DE/current-to-best/1, denoted as Pop New1 (g), Pop New2 (g), and Pop New3 (g), respectively, are gathered together as Pop New (g), with a size of NP. Next, the newly generated population Pop New (g) and the current population Pop(g) are united and the best NP individuals are selected according to non-dominated sorting. During the selection, the numbers of chosen individuals belonging to Pop New1 (g), Pop New2 (g), and Pop New3 (g) are recorded as Num 1 , Num 2 , and Num 3 , respectively. Then, the weight vector is updated as Equation (29).
According to the descriptions of the three DE strategies, the minimum number of individuals for each group is four. In order to guarantee the feasibility of the cooperative DE, the lower bound of w 1 , w 2 , and w 3 in the weight vector is set to 0.1. In other words, any element of w smaller than 0.1 is raised to 0.1, while the other elements are decreased proportionally.
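A sketch of the weight update with the 0.1 lower bound follows; the proportional form of Equation (29) and the iterative rescaling are our assumptions:

```python
def update_weights(num1, num2, num3, floor=0.1):
    """Self-adaptive weight update sketch (assumed proportional form of
    Eq. (29)): each strategy's weight follows its share of surviving
    offspring; weights below `floor` are raised to `floor` and the
    remaining weights are rescaled so the vector still sums to 1."""
    total = num1 + num2 + num3
    w = [num1 / total, num2 / total, num3 / total]
    fixed = set()                          # indices already pinned at the floor
    while True:
        low = [i for i in range(3) if i not in fixed and w[i] < floor]
        if not low:
            break
        fixed.update(low)
        free = 1.0 - floor * len(fixed)    # mass left for the unpinned weights
        rest = [i for i in range(3) if i not in fixed]
        rest_sum = sum(w[i] for i in rest)
        for i in fixed:
            w[i] = floor
        for i in rest:
            w[i] = w[i] * free / rest_sum
    return w
```

The loop re-checks the bound because rescaling can push a second weight below the floor.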

Local Intensification
In order to further enhance the quality of the solutions, a local intensification is designed and carried out on the non-dominated individuals at the end of each generation. The main procedure of the local intensification is described in Algorithm 4. For each non-dominated individual, the factories with the maximum and minimum completion times are first selected, denoted as the critical factory f c and the easiest factory f e , respectively. Next, a job assigned to f c is chosen randomly and inserted into all possible positions in f e . After that, the objective values of each insertion are calculated and the best one is kept as a neighbor. Finally, the NP best individuals are selected from the union of the current population and the neighbors.
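The neighbor construction described above can be sketched as follows, with `completion_time` and `evaluate` as hypothetical problem-specific callbacks and a scalar objective used for simplicity:

```python
import random

def local_intensification(seqs, completion_time, evaluate, rng=random):
    """Move a random job from the critical factory (max completion time)
    to the best position in the easiest factory (min completion time)."""
    times = [completion_time(f, seq) for f, seq in enumerate(seqs)]
    fc = times.index(max(times))   # critical factory
    fe = times.index(min(times))   # easiest factory
    if fc == fe or not seqs[fc]:
        return seqs
    job = rng.choice(seqs[fc])
    base = [list(s) for s in seqs]
    base[fc].remove(job)
    best, best_score = None, None
    for pos in range(len(base[fe]) + 1):   # try every insertion position
        cand = [list(s) for s in base]
        cand[fe].insert(pos, job)
        score = evaluate(cand)
        if best_score is None or score < best_score:
            best, best_score = cand, score
    return best
```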

Rescheduling Strategy
When the new order O 2 arrives, the following tasks are completed before carrying out the KCDE.

• The states of the jobs in O 1 at time t a are recorded, and the unprocessed jobs are counted.
• The unprocessed jobs are then grouped with the jobs in O 2 , together forming the new order. The factory assignments of the unprocessed jobs are also recorded and used as constraints during the rescheduling stage, since these jobs cannot be transferred to other factories once assigned.
• The available machine times are updated to the completion times of the jobs that are under processing at time t a .
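The preparation steps above can be sketched on a simplified data layout (one final-stage operation per job; all field names are illustrative):

```python
def prepare_rescheduling(schedule, t_a, new_jobs):
    """Classify O1 operations at time t_a and build the rescheduling input.
    `schedule` is a hypothetical list of dicts with keys
    job / factory / machine / start / end."""
    unprocessed, pinned, machine_avail = [], {}, {}
    for op in schedule:
        if op["start"] >= t_a:                        # not yet started: reschedule
            unprocessed.append(op["job"])
            pinned[op["job"]] = op["factory"]         # must stay in this factory
        elif op["end"] > t_a:                         # under processing: keep it,
            machine_avail[op["machine"]] = op["end"]  # machine busy until it ends
        # else: already completed, nothing to do
    jobs_to_schedule = unprocessed + list(new_jobs)   # new jobs: any factory
    return jobs_to_schedule, pinned, machine_avail
```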

Framework of KCDE
The framework of the proposed KCDE is depicted in Figure 3. Three differential evolution strategies are employed cooperatively on the knowledge base, and a self-adaptive weight vector is designed to control the proportions of the three strategies according to the quality of the offspring they generate. Moreover, several problem-specific strategies, such as hybrid initialization, the update of the knowledge base, and local intensification, are implemented. Such a properly designed algorithm is expected to perform well in solving EDHFRP.


Experimental Settings
A considerable number of instances are generated and employed to test the performance of KCDE in solving EDHFRP. Each instance has four decisive factors for its scale: the number of factories F, the number of jobs in the first order n 1 , the number of jobs in the new order n 2 , and the number of stages s. In this paper, the value of F is taken from {3, 4, 5}, n 1 from {30, 50, 80}, n 2 from {20, 30, 50}, and s from {4, 5, 6}. The processing times and the energy consumption per unit time are sampled from the instances provided by earlier research on similar problems [12,41]. Moreover, 10 instances are generated for each combination of F, n 1 , n 2 , and s; thus, instances at a total of 3 × 3 × 3 × 3 = 81 different scales are used in the following tests and comparisons. The arrival time of O 2 is set randomly, while instances of the same scale share the same t a .
Generally, more running time is needed in each iteration for a scheduling instance with higher complexity, while the operation is faster for an easy instance. In order to guarantee the reasonableness and fairness of the experiments on instances of different scales, the maximum running time is set according to the four decisive factors of the scale. Therefore, for each compared algorithm and tested algorithm, the maximum running time for the first order is set to 0.1 × F × n 1 × s seconds, while it is set to 0.1 × F × n 2 × s seconds after the arrival of O 2 . To be fair, all algorithms were implemented in MATLAB R2021b and run in the same computing environment: a 12th Gen Intel Core i7-12700K at 3.61 GHz, 32 GB RAM, and Windows 10.
Three commonly used metrics in the multi-objective optimization field, i.e., hyper-volume (HV), the C metric (CM), and generational distance (GD) [48], are utilized to evaluate the performance of the algorithms; these metrics measure the convergence and diversity of the Pareto set approximations.
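For reference, the C metric used below can be computed as in the following sketch, taking C(A, B) as the fraction of solutions in B dominated by at least one solution in A (one common definition):

```python
def dominates(a, b):
    """Pareto dominance for minimization: a is no worse in every objective
    and strictly better in at least one."""
    return all(x <= y for x, y in zip(a, b)) and any(x < y for x, y in zip(a, b))

def c_metric(A, B):
    """C(A, B): share of solutions in B dominated by some solution in A."""
    if not B:
        return 0.0
    return sum(any(dominates(a, b) for a in A) for b in B) / len(B)
```

A value of C(A, B) close to 1 means nearly all of B is dominated by A; note that C(A, B) and C(B, A) are not complementary in general.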

Model Validation
In order to validate the mathematical model provided in Section 2.2, 10 small-scale instances are generated randomly, and the exact solver IBM ILOG CPLEX 12.9 is compared with the proposed algorithm. For each comparison, both CPLEX and KCDE run 10 times independently. In each run, the CPU time limit of CPLEX is set to 1 h, while the stopping criterion of KCDE is set as mentioned above. Since the rescheduling solution is obtained on the basis of the schedule of the original order and depends on the production state at time t a , the comparison on each instance is divided into two parts.
In the first part, the original order O 1 is scheduled by both CPLEX and KCDE, and the results (MS 1 and TEC) are compared. In the second part, the schedule of the original order obtained by CPLEX is used by both the solver and the algorithm to solve the rescheduling problem; the results MS 1 , MS 2 , and TEC are compared again.
The results of the first comparison are shown in the left column of Table 1, from which it can be seen that small differences exist between the results obtained by CPLEX and KCDE, while KCDE achieves most of the better results. The right column in Table 1 provides the results of the second comparison, in which the objective values given by KCDE are better than those provided by CPLEX in all instances. Therefore, the mathematical model of EDHFRP is validated. The better value of each instance is marked in bold.

Parameter Setting
There are three important parameters in the proposed KCDE: (1) the mutation factor in the differential mutation operation (F); (2) the crossover probability in DE (CR); and (3) the weight factor used for updating the knowledge base (α). In order to inspect the effects of these parameters on the performance of KCDE and to choose the best combination accordingly, the well-known design of experiments (DOE) method is carried out as follows.
In order to be more accurate, four levels are set for each parameter. According to the empirical conclusions provided by early studies on DE, the values of F and CR are set as F ∈ {0.5, 0.55, 0.6, 0.65} and CR ∈ {0.2, 0.25, 0.3, 0.35}. On the basis of the characteristics of EDHFRP and the design of the knowledge base, the value of α is set as α ∈ {0.75, 0.8, 0.85, 0.9}. Then, 16 different combinations of (F, CR, α) are tested on a group of 27 typical instances, including three small-sized, three moderate-sized, and three large-sized instances for each F ∈ {3, 4, 5}, according to the orthogonal array L 16 (4 3 ).
To be persuasive, the HV indicator is used to represent the performance of the algorithm, and KCDE runs 10 times independently on each instance with each combination. The average HV values are recorded and shown in Table 2. When calculating HV, the combination of the minimum objective values among all results for each instance is regarded as the lower bound, while the maximum values are combined and regarded as the upper bound. Next, the average value of HV is calculated for each level of each parameter, and the level with the largest HV value is chosen. The average HVs and the ranks of the parameters are shown in Table 3, where Delta is the difference between the largest and smallest HV values for each parameter. A parameter with a larger Delta has a more significant effect on the performance of KCDE. Meanwhile, the main effects of the parameters are illustrated in Figure 4. The best value of each level is marked in bold.
It can be seen from Table 3 that CR has the greatest influence on KCDE when solving EDHFRP at different scales, followed by α and F. As illustrated in Figure 4, CR denotes the proportion in which the offspring learns from the trial individual rather than its parent, and a small value of CR may result in a rather slow evolutionary process. α determines how much the knowledge base learns from the elite individuals in each generation; from the trend depicted in Figure 4, it can be seen that a small α results in a lower HV, which is caused by the loss of useful information. F controls the influence of the parent vectors on the trial vector during differential mutation; a small F is more likely to decrease the diversity of the population, while a too-large F leads the population to slower convergence, as shown in Figure 4. Therefore, according to the analysis above and the results in Table 3, F = 0.6, CR = 0.35, and α = 0.9 are used in the following tests and experiments.

Effect of Hybrid Initialization
The proposed KCDE is compared to its variant KCDE-Ran, in which the initial population is generated randomly, in order to investigate the effectiveness of the hybrid initialization. For each instance, each algorithm runs 10 times independently. The average C(KCDE, KCDE-Ran) and C(KCDE-Ran, KCDE) values of the instances with the same scale are calculated and provided in Table 4. Furthermore, to analyze the significance of the difference between the two compared algorithms, the p-value of the nonparametric test with a 95% confidence level is calculated and summarized for each scale. The boxplots of all CM values are illustrated in Figure 5a.
Table 4 and Figure 5a show that, for each instance, C(KCDE, KCDE-Ran) is larger than C(KCDE-Ran, KCDE), while all p-values are smaller than 0.05. Therefore, most of the non-dominated solutions obtained by KCDE dominate those provided by KCDE-Ran, and the hybrid initialization considerably enhances the quality of the initial population and, thus, the performance of KCDE.
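The coverage metric C(A, B) used in Tables 4-6 is the fraction of solutions in B that are dominated by at least one solution in A (Zitzler's C metric; some variants count weak domination). A minimal bi-objective minimization version:

```python
def dominates(a, b):
    """Pareto dominance for minimization: a is no worse than b on every
    objective and strictly better on at least one."""
    return (all(x <= y for x, y in zip(a, b))
            and any(x < y for x, y in zip(a, b)))

def c_metric(A, B):
    """C(A, B): fraction of solutions in B dominated by some solution in A."""
    return sum(any(dominates(a, b) for a in A) for b in B) / len(B)
```

Note that C(A, B) and C(B, A) are computed separately and generally do not sum to 1, which is why both directions are reported in the tables.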

Effect of Knowledge Base
To examine the effectiveness of the knowledge base, KCDE was compared with its variant KCDE-nKB, in which each offspring is generated only with the information carried by its selected parents. Each algorithm runs 10 times independently, and the average C metric values of the instances with the same scale are calculated. Moreover, to analyze the significance of the difference between KCDE and KCDE-nKB, the sign test with a 95% confidence level was carried out, and the p-value of each scale is calculated and summarized in Table 5. The boxplots of all C metric values are provided in Figure 5b.
From Table 5 and Figure 5b, it is clear that C(KCDE, KCDE-nKB) is larger than C(KCDE-nKB, KCDE) for each instance and that all p-values are lower than 0.05, which demonstrates that most solutions in the Pareto set approximation obtained by KCDE are better than those yielded by KCDE-nKB. Thus, the knowledge base efficiently supports KCDE by providing the information carried by elite individuals.
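The paper's three-dimensional knowledge base records information carried by elite individuals; its exact structure is not reproduced here. Purely as an illustration, assume it stores the frequency with which job j appears at position p of factory f among the elites, blended in each generation with learning rate α (the indexing scheme and the blending rule are assumptions, not the paper's definition):

```python
def update_knowledge_base(kb, elites, alpha=0.9):
    """kb[f][j][p]: learned frequency of job j at position p in factory f.

    Blend this generation's elite statistics into the knowledge base:
    kb <- (1 - alpha) * kb + alpha * freq.  (Assumed update rule for
    illustration only.)
    """
    n_f, n_j, n_p = len(kb), len(kb[0]), len(kb[0][0])
    freq = [[[0.0] * n_p for _ in range(n_j)] for _ in range(n_f)]
    for schedule in elites:          # schedule: one job sequence per factory
        for f, seq in enumerate(schedule):
            for p, j in enumerate(seq):
                freq[f][j][p] += 1.0 / len(elites)
    return [[[(1 - alpha) * kb[f][j][p] + alpha * freq[f][j][p]
              for p in range(n_p)] for j in range(n_j)] for f in range(n_f)]
```

Under this reading, a large α (0.9 in the calibrated setting) lets the knowledge base track the current elites closely, consistent with the observation that a small α loses useful information.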

Effect of Local Intensification
The effectiveness of the local intensification is investigated by comparing the proposed KCDE with KCDE-nLI, in which the local intensification is absent. The average C(KCDE, KCDE-nLI) and C(KCDE-nLI, KCDE) values of the instances with the same scale were calculated and summarized in Table 6. Furthermore, to analyze the significance of the difference between the two algorithms, the sign test with a 95% confidence level was carried out and the p-values were calculated. The boxplots of all C metric values are provided in Figure 5c.
As shown in Table 6 and Figure 5c, for each instance, C(KCDE, KCDE-nLI) is larger than C(KCDE-nLI, KCDE), and all of the p-values are lower than 0.05; hence, most elite solutions yielded by KCDE dominate those obtained by KCDE-nLI. Thus, the proposed local intensification operator significantly enhances the performance of KCDE.

Comparisons with Existing Algorithms
From Table 8 and Figure 7, one can see that KCDE achieves a higher HV and a lower GD in all instances. Thus, KCDE performs better than NSGA-II, MOEA/D, BCMA, and DCMA in solving EDHFRP, since higher HV values indicate better convergence and diversity of the Pareto set approximations, while lower GD values represent better convergence. The Pareto set approximations of the five algorithms on instances f3n30+30s5 and f3n80+20s5 are displayed in Figure 8, from which one can observe that the non-dominated solutions achieved by KCDE are distributed more evenly than those obtained by the other four algorithms. As a result, KCDE provides better solutions with shorter makespans and lower total energy consumption for EDHFRP.
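The generational distance (GD) reported alongside HV averages each obtained point's Euclidean distance to its nearest point on a reference front, so lower values indicate better convergence. The sketch below uses the mean-distance variant; some definitions instead take the root of the summed squared distances.

```python
from math import dist  # Euclidean distance, Python 3.8+

def generational_distance(front, reference):
    """Mean Euclidean distance from each point of `front` to its nearest
    point on the `reference` front (lower = better convergence)."""
    return sum(min(dist(p, r) for r in reference) for p in front) / len(front)
```

In practice the reference front is usually the non-dominated union of all solutions produced by every compared algorithm, since the true Pareto front of EDHFRP is unknown.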

Conclusions
This paper addressed the energy-efficient distributed hybrid flow-shop rescheduling problem, minimizing the makespans of the original and newly arrived orders as well as the total energy consumption simultaneously. To this end, a knowledge-based cooperative differential evolution algorithm was designed. A broad group of instances with stochastic arrival times for the new order was generated. The effects of the major parameters on the proposed algorithm were investigated, and the importance of each problem-specific strategy was verified through extensive tests and experiments. The comprehensive comparisons and analysis demonstrate that KCDE outperforms the compared algorithms on every instance.
In the future, the proposed algorithm can be extended to solve distributed rescheduling problems with more complicated constraints, such as fuzzy processing times, sequence-dependent setup times, and limited buffers. Moreover, with suitable changes to the model, this work can be applied to scheduling problems in other environments, including no-idle, no-wait, lot-streaming, and heterogeneous factories. Finally, optimization objectives with more practical significance, including tardiness, time-of-use electricity costs, and transportation, can be incorporated into the algorithm.

Figure 2. A feasible solution of an instance of EDHFRP; (a) a feasible solution before the new job's arrival; (b) the solution after order O2's arrival.

Algorithm 1: Greedy NEH heuristic
Input: Population size NP; Output: A total of NP/3 initial individuals.
1 Generate NP/3 job sequences randomly;
2 for index = 1, 2, ..., NP/3 do
3   for each job in the sequence do
4     Calculate MS1_f and TEC_f of each factory f;
5     Calculate the sum of ranks of MS1_f and TEC_f for each factory f;
6     Insert the job into one of the possible positions in the factory with the minimum summation value;

Algorithm 2: NEH heuristic with biased optimization
Input: Population size NP, initial importance vector w = [0.5, 0.5]; Output: A total of NP/3 initial individuals.
1 Generate NP/3 job sequences randomly;
2 for index = 1, 2, ..., NP/3 do
3   for each job in the sequence do
4     Insert the job into every possible position in all factories;
5     Calculate MS1 and TEC of each insertion;
6     Calculate the sum of ranks of MS1 and TEC for each insertion;
7     Select the position with the minimum summation value;
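Both heuristics share the same rank-sum selection: evaluate every candidate insertion under both objectives, rank the candidates per objective, and keep the insertion with the smallest rank sum. A sketch of that selection step, where the evaluation function and candidate encoding are stand-ins, not the paper's implementation:

```python
def best_insertion_by_rank_sum(candidates, evaluate):
    """Pick the candidate insertion whose summed per-objective rank is minimal.

    candidates: opaque insertion descriptors
    evaluate:   candidate -> (makespan, total_energy)  # stand-in evaluator
    """
    objs = [evaluate(c) for c in candidates]
    n = len(candidates)
    rank_sum = [0] * n
    for k in range(2):  # two objectives: MS and TEC
        order = sorted(range(n), key=lambda i: objs[i][k])
        for rank, idx in enumerate(order):
            rank_sum[idx] += rank
    return candidates[min(range(n), key=lambda i: rank_sum[i])]
```

Ranking rather than summing the raw objectives avoids having to weight makespan against energy consumption, which are measured in incommensurable units.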

Algorithm 4: Local intensification
Input: The current population Pop, population size NP; Output: A new population Pop_New.
1 Select all the non-dominated solutions from Pop through non-dominated sorting;
2 for e = 1, 2, ..., NE do
3   Select the factory f_c with the maximum completion time;
4   Select the factory f_e with the minimum completion time;
5   Randomly select a job j_c from factory f_c and remove it;
6   Insert job j_c into all possible positions in factory f_e;
7   for each insertion do
8     Calculate the value of each objective;
9   end
10  Sort the newly generated individuals in ascending order of each objective;
11  Calculate the summation of all ranks;
12  Select the individual with the minimum summation as one neighbour;
13 end
14 Unite the current population Pop and the neighbours;
15 Select the best NP individuals from the union;

Hybrid initialization: calculate the values of NP1, NP2, and NP3 according to the weight vector; select the best NP individuals.
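The job-reassignment move at the core of Algorithm 4 (take a random job from the factory with the maximum completion time and try it in every position of the factory with the minimum completion time) can be sketched as follows; the completion-time evaluator is a stand-in, and jobs are assumed to be unique identifiers.

```python
import random

def balance_move(assignment, completion_time, rng=random):
    """Generate all neighbours for one local-intensification move.

    assignment: list of job sequences, one per factory
    completion_time: factory sequence -> completion time (stand-in)
    Returns the candidate assignments produced by every insertion position.
    """
    times = [completion_time(seq) for seq in assignment]
    f_c = max(range(len(assignment)), key=times.__getitem__)  # most loaded
    f_e = min(range(len(assignment)), key=times.__getitem__)  # least loaded
    if f_c == f_e or not assignment[f_c]:
        return []
    j_c = rng.choice(assignment[f_c])
    donor = [j for j in assignment[f_c] if j != j_c]  # jobs assumed unique
    neighbours = []
    for pos in range(len(assignment[f_e]) + 1):
        new = [list(seq) for seq in assignment]
        new[f_c] = list(donor)
        new[f_e] = assignment[f_e][:pos] + [j_c] + assignment[f_e][pos:]
        neighbours.append(new)
    return neighbours
```

Algorithm 4 then ranks these neighbours under both objectives and keeps the one with the minimum rank sum, so each move rebalances load across factories without sacrificing either objective outright.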

Figure 6. Boxplots of the C metric values of KCDE, NSGA-II, MOEA/D, BCMA, and DCMA.
Table 7. Average C metric values of KCDE and NSGA-II, MOEA/D, BCMA, DCMA (columns: Dataset, C(KCDE, ·), C(·, KCDE)).
Figure 7.
Figure 8.

Nomenclature fragments:
- Energy consumption of machine i in processing mode at stage k in factory f.
- ECI f,k,i: Energy consumption of machine i in idle mode at stage k in factory f.
- Energy consumption of machine i per unit time at stage k in factory f in processing mode.
- EI: Energy consumption per unit time in idle mode.

Table 2 .
Orthogonal Arrays and HV Values.

Table 3 .
Average HVs and the Rank of Each Parameter.

Table 4 .
Average C Metric Values of KCDE and KCDE-Ran.

Table 5 .
Average C Metric Values of KCDE and KCDE-nKB.