A Meta-Model-Based Multi-Objective Evolutionary Approach to Robust Job Shop Scheduling

: In the real-world manufacturing system, various uncertain events can occur and disrupt the normal production activities. This paper addresses the multi-objective job shop scheduling problem with random machine breakdowns. As the key of our approach, the robustness of a schedule is considered jointly with the makespan and is deﬁned as expected makespan delay, for which a meta-model is designed by using a data-driven response surface method. Correspondingly, a multi-objective evolutionary algorithm (MOEA) is proposed based on the meta-model to solve the multi-objective optimization problem. Extensive experiments based on the job shop benchmark problems are conducted. The results demonstrate that the Pareto solution sets of the MOEA are much better in both convergence and diversity than those of the algorithms based on the existing slack-based surrogate measures. The MOEA is also compared with the algorithm based on Monte Carlo approximation, showing that their Pareto solution sets are close to each other while the MOEA is much more computationally e ﬃ cient.


Introduction
Production scheduling is of great significance in both scientific study and engineering applications [1][2][3][4][5]. Generally, the aim of the job shop scheduling problem (JSS) is to find a schedule that minimizes certain performance objective, given a set of machines and a set of jobs. However, in practice, the execution of a schedule is usually confronted with disruptions and unforeseen events, such as random machine breakdowns (RMDs), which make the actual performance of a schedule hard to predict. Against this background, we will focus on the multi-objective robust JSS under RMDs with the goal of optimizing the makespan and the robustness simultaneously.
In the last few decades, the JSS problems have been extensively studied, most of which have addressed the JSS with makespan as the objective [6][7][8], such as the hybrid genetic algorithm [9], the genetic algorithm [10] with search area adaptation [11], the global optimization technique which combines tabu search with the ant colony optimization [12], and the memetic algorithm conditioned on a limited set of human operators [13]. However, it is often assumed that the problem parameters about jobs and machines are known and deterministic. This makes it difficult to generate a good schedule for a real-world job shop which is subjected to various uncertainties [14], such as machine breakdowns, variable processing times, and due date changes. Mehta et al. [15] had classified the uncertainties in the practical manufacturing into three main categories: complete unknowns, suspicions about the future, and known uncertainties. Complete unknowns are those unpredictable events, e.g. a sudden strike, about which no a-priori information is available, while suspicions about the future arise from the intuition and experience of the human scheduler. On the other hand, known uncertainties are those events about which some information is available in advance, such as machine breakdowns [16][17][18] whose frequency and duration may be characterized by probability distributions. Under these uncertainties, a schedule will be difficult to execute as planned, and finally the actual performance of the schedule will deteriorate.
Recently, robust optimization has gained intensive interest [19], where most of existing studies focus on addressing the scheduling problems under known uncertainties. The robustness of a schedule indicates the ability of the schedule to preserve a specified level of solution quality in the presence of uncertainties [20], which is generally measured by the expected deviation of the performance from its initial performance under uncertainties [21]. Liu et al. [22] defined the robust schedule as a schedule that is insensitive to uncertainties, such as that a schedule may degrade its performance to a very small degree under disruptions. Thus, in addition to the makespan, the robustness of a schedule will also be taken as one of the objectives in the robust scheduling. Xiao et al. [23] addressed the stochastic JSS problem with uncertain processing times, and the robustness took the expected relative deviation between the realized makespan and the predictive makespan. Zuo et al. [24] considered both the expectation and standard deviation of the performance of a schedule. Ahmadi et al. [25] defined the robustness of a schedule as the expected deviation of starting and completion time of each job between preschedule and realized schedule under RMD.
With simultaneous consideration of the performance and the robustness of a schedule, the JSS under uncertainties holds a multi-objective nature. Usually, the two objectives are combined by the weight sum, and then the problem with two objectives will be transferred into a single-objective problem, such as that described in [26,27]. However, providing a wide range of solutions to decision-makers might be more useful [20], since decision-makers can make a better trade-off between the performance and the robustness for their schedules. The multi-objective evolutionary algorithm (MOEA), such as NSGA II [28][29][30], has been successfully solved the classic JSS without considering uncertainties [31]. In addition, Hosseinabadi, et al. [32] proposed a TIME_GELS algorithm that uses the gravitational emulation local search [33] for solving the multiobjective flexible dynamic job-shop scheduling problem. But, when the robustness is considered, a MOEA should further be able to solve the problems in the presence of uncertainties [34].
Because of the intractable complexity of JSS with uncertainties, it is difficult to evaluate the effects of uncertainties on a schedule, and thus the robustness is not available in a closed form. In this case, approximation methods should be applied for fitness evaluation in the MOEA. The simplest way is to employ the Monte Carlo method [35,36] to estimate the robustness, by averaging the objective values over a few randomly sampled uncertainty scenarios. The Monte Carlo method is based on random sampling and has been proven to be a powerful method for estimation [37][38][39]. However, it may cause potentially expensive fitness evaluations and reduce the computing efficiency. For this reason, the problem approximation that tends to replace the original statement of a problem by one which is simple but easier to solve, has been applied. Mirabi, et al. [40] simplified the machine breakdown by assuming the repair time is constant, while Liu et al. [41] assumed that all possible machine breakdowns during a scheduling horizon are aggregated as one. However, this may lead to a large mismatch between the model and the actual problem. In contrast, the meta-model which is an approximate function of the real fitness, also known as the surrogate measure, may be preferred. However, the design of a meta-model for the robustness is challenging. So far, the available surrogate measures for the robustness measured by the expected makespan delay (EMD) only include the average total slack time of operations in [42] and the sum of free slack times of operations in [26]. Since these surrogate measures ignored uncertainties, their estimation accuracy will be reduced [43].
In comparison with the existing research, the main contributions of our approach are as follows: • The robustness of a schedule is considered jointly with the makespan and is defined as EMD, for which a meta-model is designed by using a data-driven response surface method.
• A MOEA is proposed based on the meta-model, gaining excellent performance and efficiency.
The remainder of this paper is organized as follow. The multi-objective optimization model for the JSS under RMD is defined in the next section. In Section 3, a meta-model-based MOEA is proposed. The performance of the proposed algorithm is presented in Section 4, with comparison with the Monte Carlo method. In Section 5, we conclude the whole study.

Problem Definition
We consider the JSS problem with n jobs (J = J j j = 1, 2, . . . , n ) to be processed on m machines (M = {M i |i = 1, 2, . . . , m}). All jobs and machines are available at time zero. The processing of job j on machine i is called operation O ij , i ∈ [1, m], j ∈ [1, n], and its processing time p ij is constant and known. Each job includes m operations (O j = O ij i = 1, 2, . . . , m , j ∈ [1, n]) that must be processed in a specified sequence through each machine. A feasible schedule that specifies the starting and completion time of all operations should satisfy the constraints: (1) job splitting is not allowed; (2) an operation is not allowed to be preempted by the others; (3) each operation is performed only once on one machine; and (4) each machine performs only one operation at a time.
In this study, the operation-based machine breakdown model presented in [21] will be applied. More specifically, the machine breakdown is modeled by two parameters: the downtime required to repair the machine after its breakdown, and the breakdown probability during a time interval. The machine breakdown probability Pr ij when processing operation O ij can be calculated by Equation (1), where λ 0 is the machine failure rate of each machine.
The downtime D ij of a machine after a breakdown when processing operation O ij is modeled as an exponential distribution, as shown in Equation (2), where β 0 is the expectation of the downtime.
In the classic JSS problems without considering uncertainties, the aim of scheduling is generally to find a schedule that minimizes the makespan. The makespan of a schedule is equal to the maximum completion time of all operations in the schedule, which can be determined by Equation (3), where ct ij is the completion time of operation O ij .
However, the makespan of a schedule will be affected by RMDs in practice [16][17][18]. Since RMDs postpone the completion time of operations, the actual makespan C r max of a schedule will be delayed. As shown in Figure 1a, the makespan C 0 max of the schedule before execution is equal to 10. If a machine breakdown with one unit of downtime occurring on the operation O 21 , as shown in Figure 1b  According to the analysis above, the influence of machine breakdowns on the makespan of a schedule can be given by the makespan delay δ c in Equation (4).
Since machine breakdowns take place randomly, the makespan delay of a schedule will vary with the actual scenario of machine breakdowns, which makes the actual makespan very instability. The stochastic change of the actual makespan will reduce the performance of a schedule, such as that it may lead to the products cannot be delivered on time and lose the customer good will. Therefore, it is preferred that the makespan of a schedule is robust under RMDs. To measure the robustness of makespan, the EMD will further be applied, as the ∆ c shown in Equation (5).
However, a schedule with the minimum makespan is generally very compact, which means that it may be very sensitive to RMD and with a large EMD. Thus, the makespan C 0 max of a schedule will conflict with the EMD. Since different decision-makers have various preferences for the makespan and the makespan delay, it is worth providing a wide range of schedules for decision-makers to make the best trade-off. When consider the makespan and the EMD of a schedule at the same time, the JSS under RMD can be modeled as a multi-objective optimization problem. Let A be the set of precedence constraints (O ij , O kj ) that require job j to be processed on machine i before it is processed on machine k, the multi-objective optimization model can be provided as follows: Minimize: Subject to:

The Meta-Model Based MOEA
When solving the multi-objective optimization model in Section 2 by a MOEA, the primary task is to evaluate the fitness of each individual in a population. However, the EMD in Equation (5) cannot be analytically calculated for the intractable complexity of JSS under RMD. Although the commonly-used Monte Carlo simulation can be used to approximate the EMD, it will make the MOEA inefficient for it is very time-consuming to evaluate each single individual, especially for the problems with larger scales. In view of this, a meta-model-based MOEA will be proposed to solve the multi-objective optimization model for the robust JSS.

Framework of The Algorithm
The meta-model-based MOEA is designed according to the basic framework of the classic NSGA-II [28]. As shown in Figure 2, the algorithm begins with an initial population P 0 with N randomly generated individuals. Before executing the following genetic operators, the meta-model ∆ a c of the ∆ c will be constructed based on the initial population P 0 . Then, the selection, crossover, and mutation operators will be applied on the current population P k to generate new individuals and construct a combined population R k+1 . The fitness of individuals in the combined population R k+1 will first be evaluated by the makespan C 0 max and the proposed meta-model ∆ a c , and then the individual-based evolution control will be applied to update the fitness of some individuals. Finally, the next generation population P k+1 will be generated according to the ranks of individuals. When the maximum generation number is reached, the algorithm will stop and return the obtained Pareto solution set.

Meta-Model of The EMD
The meta-model for the EMD will be constructed based on the response surface methodology. As shown in Equation (12), this method applies a quadratic polynomialf (x) to approximate the function relation between the input x and the output y of a system, where, x is the input variant vector with v variables as shown in Equation (13), a 0 is the constant term, B is the coefficient vector of the linear term as shown in Equation (14), and C is the coefficient matrix of the quadratic term as shown in Equation (15).
However, a schedule cannot be directly taken as an input variant of the EMD, since it cannot be quantified. Therefore, to construct a meta-model by the response surface method for the EMD, the primary task is to extract features related to the EMD from the schedule. To this end, we will further analyze how RMDs affect the makespan of a schedule. As we all know, a feasible job shop schedule is decided by the process constraints and the resource constraints. As a result, machines may have some idle time during a schedule period, as shown in Figure 3a. In the classic JSS problems, we are devoted to reducing the idle time to minimize the makespan for improving the utilities of machines. However, when RMDs are considered, the idle time may be useful for an operation to control the influence of machine breakdowns on the makespan of a schedule and then improve the robustness of the makespan.
The available idle time of operations in a schedule can be classified into two types: the free slack time and the total slack time. The former is the time that an operation can be delayed without delaying the starting of its very next operations, while the latter is the difference between the earliest and latest starting times of an operation without delaying the makespan. Take the schedule in Figure  It is clear that not all operations have the free/total slack time. For an operation without slack time, the makespan of a schedule will be directly delayed, when an RMD takes place on it. As shown in Figure 3b, when a RMD with one unit of downtime takes places on the operation O 11 , the actual makespan C r max is changed to 11, which is directly delayed by one unit of time. However, when an RMD takes places on an operation with slack time, the makespan will not be delayed until the slack time of this operation is used up. As shown in Figure 3c, after a RMD with one unit of downtime takes place on the operation O 33 with two units of free slack time, the makespan is still equal to 10, for the free slack time of this operation is larger than the downtime of the breakdown. On the other hand, although the operation O 22 has no free slack time, the makespan can also be protected by the total slack time of the operation, as shown in Figure 3d.  According to the analysis above, it can be found that the makespan delay of a schedule depends on the machine breakdown level, the free slack time and total slack time of each operation. In view of this, we will extract the mathematical features for the schedule under RMDs from these three aspects: the RMDs, the set O y of operations with slack time and the set O n of operations without slack time. For the RMDs, the machine failure rate λ 0 and the expected downtime β 0 after a breakdown will be taken. For the operations in the set O y , all their processing time p ij , free slack time f s ij and total slack time ts ij will be taken. As for the operations in the set O n , only their processing time p ij will be taken.
However, it is practically impossible to consider all the processing time, free slack time, and total slack time of operations as the input variants of the EMD. To reduce the number of input variants, we will further generalize these basic features into some comprehensive features. Formally, the sum of processing time p y s and p n s will be used to represent the processing time of operations in the sets O y and O n , respectively. For the slack time, the average free slack time f s a and the average total slack time ts a of operations in the set O y will be applied.
Finally, the input variants of the EMD can be listed as follows: the machine failure rate λ 0 , the expected downtime β 0 , the sum of processing time p n s , the sum of processing time p y s , the average free slack time f s a , and the average total slack time ts a . Then, the input variant vector x of the EMD can be set as x = (x 1 , x 2 , x 3 , x 4 , x 5 , x 6 ) = (λ 0 , β 0 , p n s , p y s , f s a , ts a ), and then the meta-model ∆ a c of ∆ c can be defined by Equation (16).
Then, the coefficients should further be determined to finalize the meta-model ∆ a c . As shown in Algorithm 1, it takes the initial population P 0 and the input variant vector x as the inputs, and outputs the meta-model ∆ a c with the determined coefficients. First, a training data set D c which includes N data instances will first be generated based on the initial population P 0 . A data instance is composed of the values of the input variant vector x i and the corresponding ∆ i c . The values of the input variant vector x i can be determined once a schedule s i is generated based on the ith individual in the initial population P 0 . Since the EMD cannot be analytically calculated, it will be evaluated by the Monte Carlo approximation ∆ sim c as shown in Equation (17). After the training data set D c is constructed, the Multiple Linear Regression will be used to determine the coefficients of the meta-model ∆ a c for the EMD, where N s is the simulation times and C i max is the makespan of a schedule under the ith simulation.

Algorithm 1
The pseudo-code to finalize the meta-model.
Inputs: the initial population P 0 and the input variant vector x Outputs: the meta-model ∆ a c with the determined coefficients 1: Set the training data set D c = ∅; 2: Generate N s machine breakdown scenarios with the machine failure rate λ 0 and the expected downtime β 0 ; 3: for i = 1 to N 4: Generate the schedule s i based on the ith individual in P 0 ; 5: Determine the makespan C 0 max of the schedule s i by Equation (3); 6: Calculate the free slack time f s i j and the total slack time ts i j in schedule s i ; 7: Calculate the sum of processing time p n s and p y s ; 8: Calculate the average free slack time f s a and the average total slack time ts a ; 9: Determine the values x i of the input variant vector x = (λ 0 , β 0 , p n s , p y s , f s a , ts a ); 10: Determined the EMD ∆ i c = ∆ sim c by Equation (17); 11: Generate the ith data instance Update the training data set D c = D c ∪ I i ; 13: end for 14: Based on the training data set D c , apply the multiple linear regression to determine the coefficients of the meta-model ∆ a c in Equation (16); 15: Return the finalized meta-model ∆ a c .

Fitness Evaluation
To compare the fitness of different individuals, the makespan and the EMD of each individual in a population should be evaluated. Once a schedule is generated, the makespan can be directly determined by Equation (3). However, the EMD cannot be analytically calculated for the complexity of the JSS. Although it can be effectively approximated by the time-consuming Monte Carlo approximation in Equation (17), the efficiency of the algorithm will be significantly reduced. In view of this, we will apply the proposed meta-model ∆ a c in Equation (16) to approximate the EMD. The basic motivation for using the meta-model in the fitness evaluation is to reduce the number of expensive fitness evaluations without degrading the quality of the obtained optimal solution.
Based on the proposed meta-model ∆ a c , Algorithm 2 can be used to provide the fitness set represents the fitness of the ith individual in the combined population P k+1 with the makespan F i k+1 (1) = C 0 max (s i ) and the EMD F i k+1 (2) = ∆ a c (s i ).

Algorithm 2 Fitness evaluation for the combined population
Inputs: the combined population R k+1 Outputs: the fitness set F k+1 of the individuals in R k+1 1: Set the fitness set F k+1 = ∅; 2: for i = 1 to 2N 3: Select the ith individual chm i from the population R k+1 ; 4: Generate the schedule s i based on the individual chm i ; 5: Evaluate the makespan C 0,i max of the schedule s i by Equation (3); 6: Get machine failure rate λ 0 and expected downtime β 0 ; 7: Calculate the free slack time f s i j and the total slack time ts i j in the schedule s i ; 8: Calculate the sum of processing time p n s and p y s ; 9: Calculate the average free slack time f s a and the average total slack time ts a ; 10: Determine the values x i of the input variant vector x = (λ 0 , β 0 , p n s , p y s , f s a , ts a ) 11: Evaluate the EMD by ∆ a c (s i ) in Equation (16) with the values of x i ; 12: Update the fitness set F k+1 = F k+1 ∪ F i k+1 = F k+1 ∪ (C 0 max (s i ), ∆ a c (s i )); 13: end for 14: Return the fitness set F k+1 ;

Individual-Based Evolution Control
Generally, the approximate model is assumed to be of high fidelity and, therefore, the real fitness function will be not at all used in the evolution [20]. However, an evolutionary algorithm using meta-models without controlling the evolution using the real fitness function can run the risk of an incorrect convergence [28]. For this reason, the meta-model is combined with the real fitness function in our algorithm, which is often known as evolution control or model management.
As shown in Algorithm 3, the individual-based evolution control framework will be applied, which chooses the best individuals according to the pre-evaluation using the meta-model ∆ a c for reevaluation using the real fitness function. For this purpose, the fitness set F k+1 will first be ranked by the fast non-dominated sorting. And then, the individuals with the rank rank(F i k+1 ) = 1 will further be reevaluated by the Monte Carlo approximation ∆ sim c in Equation (17). In addition, to avoid the unnecessary simulation computational time, the repeated individuals will only be evaluated once.

Algorithm 3 Individual-based evolution control framework
Inputs: the fitness set F k+1 of the combined population R k+1 Outputs: the modified fitness setF k+1 1: Generate N s scenarios with the machine failure rate λ 0 and the expected downtime β 0 ; 2: Rank the fitness set F k+1 by the fast non-dominated sorting; 3: Sort the fitness set F k+1 in the ascending lexicographic order of the rank, the makespan and the EMD; 4: SetF k+1 = F k+1 ; 5: for i = 1 to 2N 6: if rank(F i k+1 ) > 1 7: Update the fitnessF

Evolutionary Operators
In our algorithm, the preference list representation is applied to code the chromosomes. The chromosome built by this coding method is made up of m substrings corresponding to m machines. Every substring is a preference list of n jobs on the corresponding machine. Supposing the chromosome is [ (2 3 1) (1 3 2) (2 1 3)], the substring (2 3 1) is the preference list for machine 1, the substring (1 3 2) for machine 2 and the substring (2 1 3) for machine 3. When decoding, the job the first to appear in every precedence list will be selected firstly. If a selected operation meets the process constraint, the operation will be scheduled, and then it is removed from corresponding preference list. If there are more than one operation can be scheduled, then select one randomly.
The main genetic operators include selection, crossover and mutation. The usual binary tournament selection is used to select parent individuals for generating child solutions. Namely, randomly select two individuals from the population, and choose one of them with better fitness for the subsequent genetic operators. In the crossover operator, the substring crossover which exchanges the substrings of parents between two randomly selected machine numbers is applied. For the mutation operator, the swap-mutation operator to a randomly selected substring is applied.
Before updating the next generation population, the fast non-dominated sorting approach is applied to ranking the solutions in the combined population. Then, the population will be updated by choosing the individuals in the order of their ranks. Since all the previous and current population members are included in the combined population, the elitism can also be ensured.

Experimental Analysis
In this section, the performance of the meta-model in evaluating the EMD will first be presented, and then the meta-model-based MOEA will be used to solve the robust JSS problem.

Experiment Setting
The algorithm is implemented using C++ and run on a 2.8 GHz PC with an Intel Pentium dual-core CPU and 2 GB of RAM. The parameters are listed as follows: the population size is 1024; the generation number is 64; the crossover rate is 0.95; the mutation rate is 0.05; the machine breakdown ratio is 0.005; the expected downtime is 20; and the simulation times are 600.
In the literature, many benchmark problems have been generated by different researchers to test the performance of different algorithms, which are also very useful for this research for they include a wide range of problem instances. In this study, the problem instances La01-La40 with sizes from 10 × 5 to 30 × 10 in the benchmark problem set LA (Lawrence in 1984) and the problem instances Ta01-Ta40 with sizes from 15 × 15 to 30 × 15 in the benchmark problem set TA (Taillard in 1994) will be applied. In total, there are 80 benchmark problem instances will be used to test the performance of the proposed algorithm.

Evaluation Performance of The Proposed Mete-Model
To distinguish the robustness of different schedules, an effective meta-model must have high evaluation accuracy and perform a strong linear correlation to the real value of the robustness. To show the accuracy of the proposed meta-model ∆ a c in evaluating the EMD, the average χ in Equation (18) and standard variance σ(χ) in Equation (19) of the absolute relative deviation χ(∆ a c , ∆ sim c ) from the Monte Carlo approximation ∆ sim c will be applied. In addition, a correlation study will be conducted using IBM SPSS. For each test problem, the R 2 statistic and the significance level Sig. of the linear model ANOVA are recorded. The value of R 2 is used to measure the fitting degree of the meta-model, while the significance level Sig. is used to test whether there is a significant linear correlation between the meta-model and the EMD. Therefore, a good meta-model should be with a large value of R 2 and a small value of Sig. for each test problem.
The experimental results have been processed and recorded in Tables 1 and 2 for the problem sets LA and TA, respectively. It can be found that the maximum and minimum values of χ for all problems in LA are equal to 0.041 and 0.020, respectively. And, the maximum and minimum values of χ for all problems in TA are equal to 0.026 and 0.017, respectively. That is, the values of χ for all test problems are less than 0.05 in average. Therefore, the values of the meta-model are very close to that of the Monte Carlo approximation, which indicate that the proposed meta-model have high accuracy in evaluating the EMD for the JSS under RMD. On the other hand, the maximum and minimum values of σ(χ) for all problems in LA are equal to 0.031 and 0.015, respectively. The maximum and minimum values of σ(χ) for all problems in TA are equal to 0.020 and 0.013, respectively. All these results show that the meta-model have a small variance in the absolute relative deviation, which indicate that the performance of the meta-model is also robust in evaluating the EMD. The results can also be clearly presented by the quartile graphs of the absolute relative deviation χ(∆ a c , ∆ sim c ), as shown in Figures 4 and 5. In Figure 4, the maximum value of the absolute relative deviations χ(∆ a c , ∆ sim c ) for all problems in LA is about 0.19, but more than 75% of the values are less than 0.06 for all the problems in the problem set LA. Especially, when the problem scale is larger, such as the problems LA21-LA40, more than 75% of the values of χ(∆ a c , ∆ sim c ) are less than 0.04. As for the problem set TA in Figure 5, the maximum value of the absolute relative deviation χ(∆ a c , ∆ sim c ) for all problems is only about 0.12, and more than 75% of the values are less than 0.04 for all the test problems. Therefore, it can be concluded that the proposed meta-model has a very small estimation error for the EMD.
On the other hand, the results of the linear model ANOVA have also been provided in Tables 1  and 2. For the results in Table 1, except for the problems La02, La03, La09, and La20, we can find that all the values of R 2 are larger than 0.70. Especially for the problems La21-La40 with larger problem scales, the average of R 2 even reaches to 0.76. All the values of R 2 are larger than 0.70 for the problems in TA and the average value is about 0.76 as shown in Table 2. In addition, for all the problems in LA and TA, the significance level Sig. is less than 0.01. All these results show that the proposed meta-model have a significant linear correlation with the expected makespan.   In summary, the proposed meta-model ∆ a c have high evaluation accuracy and holds a strong linear correlation to the EMD. This means that the meta-model ∆ a c is able to effectively distinguish the robustness of different schedules in the evolutionary algorithm.

Optimization Performance of The Proposed Algorithm
In this section, the performance of the proposed meta-model (MM)-based MOEA in optimizing the makespan and the EMD for the JSS under RMD will be presented. To show the performance of the algorithm, the Monte Carlo approximation in Equation (17)-based MOEA (MC) will be applied. In addition, the proposed meta-meta will also be compared with the existing surrogate measures, and thus the results on the MOEAs with the average total slack time (SM1) in [42] and the sum of free slack time (SM2) in [26] will also be provided. By implementing these algorithms with various approximations of EMD, four Pareto solution sets can be obtained for each problem.
To investigate the performance of MOEAs, many metrics have been developed and applied in the related research [44]. Since a single metric can only provide some specific but incomplete of performance, to comprehensively evaluate the performance of the proposed algorithm, both the average distance and the number of distinct choices will be used in our experiments to measure the convergence and diversity of the algorithm, respectively.
Average distance metric A d in Equation (20) evaluates the closeness of the obtained Pareto solution set PF find to the true Pareto solution set PF true , where d(z i , a j ) denotes the Euclidean distance between z i in PF find and all points a j in PF true .
Number of distinct choices N µ in Equation (21) focuses on the distribution of solutions, which defines the number of distinct choices for a pre-specified value of µ, 0 < µ < 1. In this metric, an m-dimensional objective space will be divided into 1/µ m number of small grids, where any solutions within the same grid are considered similar to one another. If there are individuals in the obtained Pareto set PF find that fall into the region T(l m , . . . , l 2 , l 1 ), NT(l m , . . . , l 2 , l 1 ) is equal to one, otherwise zero. In our experiments, the value of µ for the metric N µ is taken as 0.05.
Since the NP-hard nature of the JSS under RMD, the true Pareto set PF true is not available for all test problems. In view of this, the approximate Pareto solution set which is provided by all comparison algorithms will be used to represent the true one. That is, under the obtained Pareto fronts (PF 1 find , PF 2 find , . . . , PF n p find ) by different algorithms for a test problem, the true Pareto solution set can be approximated by Equation (22), where a j ≺ b i implies that a j dominates b i . Besides, in order to reduce the scale difference between different objectives, all Pareto fronts will be normalized by PF find = (PF find − PF min true )/(PF max true − PF min true ) based on the maximum and minimum of objectives of the true Pareto set PF true .
The results on the metric A d of the algorithms MC, MM, SM1, and SM2 have been provided in Tables 3 and 4 for the problem sets LA and TA, respectively. The results show that the algorithms SM1 and SM2 have the similar performance in the convergence, for the values of A d of the algorithms SM1 and SM2 are always close to each other. For example, in Table 3, their average values of A d in the problems La01-La20 are 1.27 and 1.26, and the average values of A d in the problems La21-La40 are 0.21 and 0.21, respectively. The similar results can also be found in Table 4, the average values of A d in the problems Ta01-Ta20 are 0.21 and 0.24, while they are 0.22 and 0.23 in the problems Ta21-Ta40.
By comparison, the proposed algorithm MM performs better in the convergence than the algorithms SM1 and SM2. This is because that, except for the problems La39, Ta14, and Ta39, all the values of A d for the algorithm MM in the problem sets LA and TA are less than that of the algorithms SM1 and SM2. In average, the values of A d for the algorithm MM in the problems La01-La20 and La21-La40 are 0. 16  On the other hand, the results also indicate that the convergence of the algorithm MM can even be very close to that of the algorithm MC. As shown in Table 3, the results show that the proposed algorithm MM can have the best convergence in the problems La04, La05, La07, La10, La12, La13, La14, La22, La24, La26, La27, and La31 from the problem set LA and Ta06, Ta09,Ta19, Ta23, Ta24, Ta37, and Ta37 from the problem set TA. In addition, the average of A d for the algorithm MM in the problems La01-La21 with smaller problem scales is 0.16, which is 0.12 larger than that of the algorithm MC. But, in the problems with larger problem scales, such as the problems La21-La40 and Ta01-Ta40, the average of A d for the algorithm MM is only 0.06 larger that of the algorithm MC. Therefore, we conclude that the proposed algorithm MM is better than the algorithms SM1 and SM2 and similar to the algorithm MC in convergence.  The results on the metric N µ of the algorithms MC, MM, SM1 and SM2 are presented in Tables 5  and 6 for the problem sets LA and TA, respectively. The results show that the algorithms SM1 and SM2 also have the similar performance in the diversity. From the results in Table 5, we can found that the average values of N µ for the algorithms SM1 and SM2 in the problems La01-La20 are 4.7 and 5.2, and the average values of A d for the problems La21-La40 are 9.3 and 10.5, respectively. That is, the average difference of A d between the algorithms SM1 and SM2 is only about 1.0 in average. The similar results can also be found in Table 6, the average values of N µ for the problems Ta01-Ta20 are 10.2 and 9.7, while they are 8.9 and 8.4 in the problems Ta21-Ta40.  Compared with the algorithms SM1 and SM2, the proposed algorithm MM performs better in diversity. This is because that the values of A d for the algorithm MM are larger than that of the algorithms SM1 and SM2 for most of the problems in the problem sets LA and TA as shown in Tables 5  and 6. In addition, the results have also shown the algorithm MM have the similar performance in diversity to that of the algorithm MC. For example, the average values of A d for the algorithms MM and MC in the problems La01-La20 are about 8.1 and 8.3, while the average values of A d are 15.3 and 13.7 in the problems La21-La40, respectively. Similar results can be found in Table 6 for the problem set TA. Therefore, we can conclude that the proposed algorithm MM is very close to the algorithm MC in diversity, but it is much better than the algorithms SM1 and SM2.
Taking the problems instances La06, La15, Ta15, and Ta36 as examples, the performance of the  algorithms MC, MM, SM1, and SM2 can also be clearly illustrated by the Pareto fronts. As shown in Figure 6, the Pareto front of the algorithm MM is very close to that of the algorithm MC, while the Pareto fronts of the algorithms SM1 and SM2 are very far from that of the algorithm MC. On the other hand, the number of solutions in the Pareto front of the algorithm MM is approximately equal to that of the algorithm MC, while the number of solutions in the Pareto fronts of the algorithms SM1 and SM2 are much less in comparison. However, as indicated earlier, the algorithm based on the Monte Carlo approximation will be very time-consuming. To investigate computational efforts for each algorithm, the average T a and the relative value T r of CPU time have been recorded, where T r is the ratio of the average T a of an algorithm (MC, MM, SM1, or SM2) to that of the algorithm MC. In this experiment, the simulation times for the Monte Carlo approximation are set as 30, which is generally the required minimum number of samples in statistical estimation [20]. As the results shown in Table 7, the algorithms SM1 and SM2 always consume the least time, while the algorithm MC consumes the most. By comparison, the time consumption of the proposed algorithm MM is almost equal to that of the algorithms SM1 and SM2, but it is much less than that of the algorithm MC. What is more, the time consumption of the algorithm MC increases significantly with the problem scale. For example, the average T a of the algorithm MC in the problems La01-La05 is 66, which is close to that of the algorithm MM with the value 53. But, in the problems Ta31-Ta40, the time consumption of the algorithm MC is about 8 times of that of the algorithm MM. In more complex problems or uncertain scenarios, a larger sample size may be needed for the algorithm MC, which will result in a computational time that is unacceptable.

Conclusions
In this study, we have addressed the robust JSS with RMDs, in which the makespan and the EMD are considered simultaneously. To improve the efficiency of the MOEA, a meta-model has been constructed by using the data-driven response surface method. Then, with the individual-based evolution control, the meta-model-based MOEA has been proposed to solve this problem. The results have shown that regarding the convergence and diversity, the proposed algorithm yields better Pareto solution sets than the algorithms with the existing slack time-based surrogate measures. Moreover, the meta-model has high accuracy in evaluating the EMD similar to the Monte Carlo approximation. Overall, the proposed meta-model-based MOEA can effectively and efficiently solve the robust JSS with RMDs. Input vector x = (λ 0 , β 0 , p n s , p y s , f s a , ts a ) I i A data instance Training data set s i Schedule of ith individual in R k+1 F i k+1 Fitness of the ith individual in R k+1 F k+1 Fitness set of the population R k+1