Chaotic Multi-Objective Simulated Annealing and Threshold Accepting for Job Shop Scheduling Problem

Abstract: The Job Shop Scheduling Problem (JSSP) has enormous industrial applicability. This problem refers to a set of jobs that must be processed in a specific order on a set of machines. For the single-objective JSSP, Simulated Annealing is among the best algorithms. However, for the Multi-Objective JSSP (MOJSSP), these algorithms have barely been analyzed, and the Threshold Accepting algorithm has not been published for this problem. It is worth mentioning that researchers in this area have not reported studies with more than three objectives, and they typically use no more than two or three metrics to measure performance. In this paper, we present two MOJSSP metaheuristics based on Simulated Annealing: Chaotic Multi-Objective Simulated Annealing (CMOSA) and Chaotic Multi-Objective Threshold Accepting (CMOTA). We developed these algorithms to minimize three objective functions and compared them, using the HV metric, with the recently published algorithms MOMARLA, MOPSO, CMOEA, and SPEA. The best algorithm is CMOSA (HV of 0.76), followed by MOMARLA and CMOTA (HV of 0.68) and MOPSO (HV of 0.54). In addition, we present a complexity comparison of these algorithms, showing that CMOSA, CMOTA, and MOMARLA share a similar complexity class, followed by MOPSO.


Introduction
The Job Shop Scheduling Problem (JSSP) has enormous industrial applicability. This problem consists of a set of jobs, formed by operations, which must be processed on a set of machines subject to precedence and resource capacity constraints. Finding the optimal solution to this problem is extremely difficult, and so it is classified as NP-hard [1,2]. On the other hand, the foundations of JSSP provide a theoretical background for developing efficient algorithms for other significant sequencing problems with many applications in production systems [3]. Furthermore, designing and evaluating new algorithms for JSSP is relevant not only because it represents a big challenge but also because of its high industrial applicability [4].
There are several JSSP taxonomies; one of which is single-objective and multi-objective optimization. The single-objective optimization version has been widely studied for many years, and the Simulated Annealing (SA) [5] is among the best algorithms. The Threshold Accepting (TA) algorithm from the same family is also very efficient in this area [6]. In contrast, in the case of Multi-Objective Optimization Problems (MOOPs), both algorithms for JSSP and their comparison are scarce.
Published JSSP algorithms for MOOP include only a few objectives, and only a few performance metrics are reported. However, it is common for the industrial scheduling requirements to have several objectives, and then the Multi-Objective JSSP (MOJSSP) becomes an even more significant challenge. Thus, many industrial production areas require the multi-objective approach [7,8].
In single-objective optimization, the goal is to find the optimal feasible solution of an objective function; in other words, to find the best values of the variables that fulfill all the constraints of the problem. For MOJSSP, in contrast, the problem is to optimize a set of objective functions f_1(x), f_2(x), …, f_n(x) depending on a set of variables x and subject to a set of constraints defined by these variables. Finding a single optimal solution is usually impossible, because optimizing some objective functions may degrade the others. In MOOP, a preference relation or Pareto dominance relation produces a set of solutions commonly called the Pareto optimal set [9]. The Decision Makers (DMs) should select from the Pareto set the solution that satisfies their preferences, which can be subjective, based on experience, or, most likely, influenced by the needs of the industrial environment [10]. Therefore, the DM needs a Pareto front that contains multiple representative compromise solutions exhibiting both good convergence and diversity [11].
In the study of single-objective JSSP, many algorithms have been applied; some of the most common are SA, Genetic Algorithms (GAs), Tabu Search (TS), and Ant Systems (ASs) [12]. As we mention below, few works in the literature solve JSSP instances with more than two objectives or apply more than two metrics to evaluate performance; for MOJSSP, the number of objectives and performance metrics remains too small [8,13-15]. The works of Zhao [14] and Mendez [8] are exceptions, because these authors presented implementations with two or three significant objective functions and two performance metrics. Moreover, SA and TA have been shown to be very efficient for solving NP-hard problems. Thus, this paper's motivation is to develop new, efficient SA-based algorithms for MOJSSP with two or more objective functions and a larger number of performance metrics.
The first adaptation of SA to MOOP was an algorithm proposed in 1992, known as MOSA [16]. An essential part of this algorithm is that it applies the Boltzmann criterion for accepting bad solutions, commonly used in single-objective JSSP, and it combines several objective functions. Single-objective JSSP optimization with SA and multi-objective optimization with MOSA differ in several aspects related to determining the energy functions, generating and using new solutions, and measuring their quality; as is well known, these energy functions are required in the acceptance criterion. Multiple versions of MOSA have been proposed over the years. One of them, published in 2008, is AMOSA, which surpassed other MOOP algorithms at that time [17]. In this work, we adapt this algorithm for MOJSSP. TA [6] is an algorithm for single-objective JSSP that is very similar to Simulated Annealing: the two algorithms have the same structure, both use a temperature parameter, and both accept some bad solutions to escape from local optima. In addition, they are among the best JSSP algorithms, and their performance is very similar. Nevertheless, a TA algorithm has not been published for MOJSSP, and so, for obvious reasons, it has not been compared with the multi-objective version of SA.
The two new algorithms presented in this paper for JSSP are Chaotic Multi-Objective Simulated Annealing (CMOSA) and Chaotic Multi-Objective Threshold Accepting (CMOTA). The first algorithm is inspired by the classic MOSA algorithm [17]. However, CMOSA is different in three aspects: (1) for the first time it is designed specifically for MOJSSP, (2) it uses an analytical tuning of the cooling scheme parameters, and (3) it uses chaotic perturbations for finding new solutions and for escaping from local optima. This process allows the search to continue from a different point in the solution space and it contributes to a better diversity of the generated solutions. Furthermore, CMOTA is based on CMOSA and Threshold Accepting, and it does not require the Boltzmann distribution. Instead, it uses a threshold strategy for accepting bad solutions to escape from local optima. In addition, a chaotic perturbation function is applied.
In this paper, we present two new alternatives for MOJSSP, considering three objective functions: makespan, total tardiness, and total flow time. The first objective is very relevant for production management applications [7], while the other two are critical for enhancing client attention service [23]. In addition, we use six metrics to evaluate these algorithms: Mean Ideal Distance (MID), Spacing (S), Hypervolume (HV), Spread (∆), Inverted Generational Distance (IGD), and Coverage (C). We also apply an analytical parameter tuning method to these algorithms. Finally, we compare the achieved results with those obtained by the JSSP algorithms of [8,14].
The rest of the paper is organized as follows. In Section 2, we make a qualitative comparison of related MOJSSP works. In Section 3, we present MOJSSP concepts and the performance metrics that were applied. Section 4 presents the formulation of MOJSSP with three objectives. Section 5 describes the proposed algorithms, their tuning method, and the chaotic perturbation. Section 6 shows the application of the proposed algorithms to sets of 70, 58, and 15 instances; the results are then compared with previous works. Finally, in Section 7, we present our conclusions.

Related Works
As mentioned above, the JSSP community has broadly investigated the performance of different solution methods for single-objective optimization. However, the situation is entirely different for MOJSSP, where only a small number of works have been published. In 1994, an analysis of SA-family algorithms for JSSP was presented [24]; two of them were SA and TA, which we briefly explain next. These algorithms suppose that the solutions define a set of macrostates of a set of particles, while the objective function values represent their energy, and both algorithms have a Metropolis cycle in which the neighborhood of solutions is explored. For the set of instances used to evaluate single-objective JSSP algorithms, SA obtained better results than TA. In both algorithms, a solution better than the previous one is always accepted. In SA, a worse solution may be accepted depending on the Boltzmann distribution criterion; this distribution is related to the current temperature value and the increment of energy (associated with the objective functions) at that temperature. In TA, a worse solution may be accepted using a criterion that tries to emulate the Boltzmann distribution: a worse solution may be accepted when its increment of energy is smaller than a threshold value that depends on the temperature and a parameter γ very close to one. At the beginning of the process, the threshold values are large because they depend on the temperatures; subsequently, the temperature parameter is gradually decreased until a value close to zero is reached, at which point the threshold is very small.
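The two acceptance rules just described can be contrasted in a short sketch (minimization is assumed; `delta_e` is the energy increment caused by the candidate solution, and the function names are ours, not the paper's):

```python
import math
import random

def sa_accepts(delta_e: float, temperature: float) -> bool:
    """Boltzmann criterion: always accept improvements; accept a
    worse solution with probability exp(-delta_e / T)."""
    if delta_e <= 0:
        return True
    return random.random() <= math.exp(-delta_e / temperature)

def ta_accepts(delta_e: float, threshold: float) -> bool:
    """Threshold Accepting: accept any solution whose deterioration
    is below the current threshold; no probabilities are computed."""
    return delta_e < threshold
```

As the temperature (and hence the threshold) decreases, both rules converge toward accepting only improving moves.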
In 2001, a Multi-Objective Genetic Algorithm was proposed to minimize the makespan, total tardiness, and the total idle time [25]. The proposed methodology for JSSP was assessed with 28 benchmark problems. In this publication, the authors randomly weighted the different fitness functions to determine their results.
In 2006, SA was used for two objectives: the makespan and the mean flow time [26]. This algorithm was called Pareto Archived Simulated Annealing (PASA), which used the Simulated Annealing algorithm with an overheating strategy to escape from local optima and to improve the quality of the results. The performance of this algorithm was evaluated with 82 instances taken from the literature. Unfortunately, this method has not been updated for three or more objective functions.
In 2011, a two-stage genetic algorithm (2S-GA) was proposed for JSSP with three objectives to minimize the makespan, total weighted earliness, and total weighted tardiness [13]. In the first stage, a parallel GA found the best solution for each objective function. Then, in the second stage, the GA combined the populations, which evolved using the weighted aggregating objective function.
Researchers from the Contemporary Design and Integrated Manufacturing Technology (CDIMT) laboratory proposed an algorithm named Improved Multi-Objective Evolutionary Algorithm based on Decomposition (IMOEA/D) to minimize the makespan, tardiness, and total flow time [14]. The authors experimented with 58 benchmark instances and used the performance metrics Coverage [27] and Mean Ideal Distance (MID) [28] to evaluate their algorithm. In Table 1, we notice studies with two or three objectives that do not report any metric. IMOEA/D stands out from the rest of the literature not only because the authors reported good results but also because they considered a more significant number of objectives and applied two metrics.
In 2008, the AMOSA algorithm based on SA for several objectives was proposed [17]. In this paper, the authors reported that the AMOSA algorithm performed better than some MOEA algorithms, one of them NSGA-II [29]. They presented the main Boltzmann rules for accepting bad solutions. Unfortunately, a MOJSSP with AMOSA and with more than two objectives has not been published.
In 2017, a hybrid algorithm between an NSGA-II and a linear programming approach was proposed [15]; it was used to solve the FT10 instance of Taillard [30]. This algorithm minimized the weighted tardiness and energy costs. To evaluate the performance, the authors only used the HV metric.
In 2019, MOMARLA was proposed, a new algorithm based on Q-Learning to solve MOJSSP [8]. This work provided flexibility to use decision-maker preferences; each agent represented a specific objective and used two action selection strategies to find a diverse and accurate Pareto front. In Table 1, we present the last related studies for MOJSSP and the proposed algorithms.

Table 1. Related MOJSSP studies and the proposed algorithms (columns: Algorithm, Objectives, Metrics).

Multi-Objective Optimization
In a single-objective problem, the algorithm finishes its execution when it finds the solution that optimizes the objective function, or one very close to it. However, for Multi-Objective Optimization, the situation is more complicated, since several objectives must be optimized simultaneously. Then, it is necessary to find a set of solutions rather than a single one. These solutions can be conflicting, because the best solution for one objective function may not be the best for the other objective functions.

Concepts
Definitions of some concepts of Multi-Objective Optimization are shown below.
Pareto Dominance: In general, for any optimization problem, solution A dominates another solution B if the following conditions are met [31]: A is strictly better than B on at least one objective, and A is not worse than B on any objective.
Non-dominated set: In a set of P solutions, the non-dominated subset P1 consists of the solutions that satisfy the following conditions [31]: any pair of solutions in P1 must be mutually non-dominated, and any solution that does not belong to P1 is dominated by at least one member of P1.
Pareto optimal set: The set of non-dominated solutions of the total search space.
Pareto front: The graphic representation of the non-dominated solutions of the multi-objective optimization problem.
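The definitions above translate directly into code (a sketch assuming minimization and objective vectors represented as tuples):

```python
def dominates(a, b):
    """True if solution a Pareto-dominates b (minimization):
    a is no worse on every objective and strictly better on at least one."""
    return all(x <= y for x, y in zip(a, b)) and any(x < y for x, y in zip(a, b))

def non_dominated(solutions):
    """Filter a list of objective vectors down to its non-dominated set."""
    return [s for s in solutions
            if not any(dominates(o, s) for o in solutions if o is not s)]
```

For example, among the bi-objective points (1, 2), (2, 1), (2, 2), and (3, 3), only the first two are non-dominated.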

Performance Metrics
In an experimental comparison of different optimization techniques or algorithms, it is always necessary to have a notion of performance. In the case of Multi-Objective Optimization, the definition of quality is much more complicated than for single-objective optimization problems, because the optimization criterion itself consists of multiple objectives, of which the most important are:
1. To minimize the distance of the resulting non-dominated set to the true Pareto front.
2. To achieve an adequate distribution (for instance, uniform) of the solutions.
3. To maximize the extension of the non-dominated front for each of the objectives; in other words, a wide range of values must be covered by the non-dominated solutions.
In general, it is difficult to find a single performance metric that encompasses all of the above criteria, and a large number of performance metrics can be found in the literature. The most popular ones were used in this research and are described below.
Mean Ideal Distance: Evaluates the closeness of the solutions of the calculated Pareto front (PF_calc) to an ideal point, which is usually the origin [28]:

MID = (1/Q) ∑_{i=1}^{Q} c_i (1)

where c_i = √(f_{1,i}² + f_{2,i}² + f_{3,i}²), f_{1,i}, f_{2,i}, f_{3,i} are the values of the i-th non-dominated solution for the first, second, and third objective functions, and Q is the number of solutions in PF_calc.
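Under the definition above, MID reduces to a few lines (a sketch assuming the ideal point is the origin; `front` is a list of objective vectors):

```python
import math

def mean_ideal_distance(front):
    """MID: average Euclidean distance of the calculated Pareto-front
    solutions to the ideal point, assumed here to be the origin."""
    return sum(math.sqrt(sum(f * f for f in sol)) for sol in front) / len(front)
```

Smaller MID values indicate a front closer to the ideal point.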
Spacing: Evaluates the distribution of the non-dominated solutions in PF_calc. When several algorithms are evaluated with this metric, the best is the one with the smallest S value [32]:

S = √( (1/(Q−1)) ∑_{i=1}^{Q} (d̄ − d_i)² ) (2)

where d_i measures the distance, in the space of the objective functions, between the i-th solution and its nearest neighbor (the j-th solution) in the PF_calc of the algorithm, Q is the number of solutions in PF_calc, and d̄ is the average of the d_i, that is, d̄ = (1/Q) ∑_{i=1}^{Q} d_i.
Hypervolume: Calculates the volume in the objective space that is covered by all members of the non-dominated set [33]:

HV = volume( ⋃_{i=1}^{Q} v_i ) (3)

The HV metric is measured with respect to a reference point (W), which can be found simply by constructing a vector with the worst objective function values.
where v_i is a hypercube constructed with the reference point W and the solution i as the diagonal corners of the hypercube [31]. The algorithm that obtains the largest HV value is the best. To perform the calculation, the data should be normalized by transforming the values into the range [0, 1] for each objective separately.
Spread: This metric was proposed to provide a more precise coverage value; it considers the distances to the extreme points of the true Pareto front (PF_true) [29]:

∆ = ( ∑_{k=1}^{M} d^e_k + ∑_{i=1}^{Q} |d_i − d̄| ) / ( ∑_{k=1}^{M} d^e_k + Q·d̄ ) (4)

where d^e_k measures the distance between the extreme point of PF_true for the k-th objective function and the nearest point of PF_calc, d_i is the distance between the i-th solution of PF_calc and its nearest neighbor, d̄ is the average of the d_i, and M is the number of objectives.
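For illustration, the HV of a two-objective minimization front can be computed exactly by summing rectangular slices against the reference point W (a sketch of the idea only; the paper's experiments involve three objectives, where the computation generalizes to unions of hypercubes):

```python
def hypervolume_2d(front, ref):
    """Exact HV of a mutually non-dominated 2-objective minimization
    front with respect to reference point `ref`."""
    hv, prev_f2 = 0.0, ref[1]
    for f1, f2 in sorted(front):  # ascending f1 implies descending f2
        hv += (ref[0] - f1) * (prev_f2 - f2)
        prev_f2 = f2
    return hv
```

For the front {(1, 3), (2, 2), (3, 1)} with reference point (4, 4), the covered area is 6.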
Inverted Generational Distance: An inverted version of the Generational Distance (GD) metric, in which all the distances are measured from the PF_true to the PF_calc [1]:

IGD = ( ∑_{j=1}^{|T|} d̂_j^p )^{1/p} / |T| (5)

where T = {t_1, t_2, …, t_|T|} is the set of solutions in PF_true, |T| is the cardinality of T, p is an integer parameter (in this paper, p = 2), and d̂_j is the Euclidean distance from t_j to its nearest objective vector q in Q, according to (6):

d̂_j = min_{q ∈ Q} √( ∑_{m=1}^{M} (f_m(t_j) − f_m(q))² ) (6)
where f_m(t) is the m-th objective function value of the t-th member of T, and M is the number of objectives.
Coverage: Represents the dominance relation between a set A and a set B [27]. It is the ratio of the number of solutions in set B that are dominated by solutions in set A to the total number of solutions in set B. The C metric is defined by (7):

C(A, B) = |{b ∈ B | ∃ a ∈ A : a ≺ b}| / |B| (7)
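A direct reading of the C metric as code (a sketch assuming strict Pareto dominance and minimization; `A` and `B` are lists of objective vectors):

```python
def dominates(a, b):
    """Strict Pareto dominance for minimization."""
    return all(x <= y for x, y in zip(a, b)) and any(x < y for x, y in zip(a, b))

def coverage(A, B):
    """C(A, B): fraction of solutions in B dominated by at least one
    solution in A."""
    return sum(1 for b in B if any(dominates(a, b) for a in A)) / len(B)
```

Note that C is not symmetric: C(A, B) and C(B, A) must both be reported to compare two fronts.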

Multi-Objective Job Shop Scheduling Problem
In JSSP, there is a set of n different jobs, consisting of operations, that must be processed on m different machines. There is a set of precedence constraints for these operations, as well as resource capacity constraints ensuring that each machine processes only one operation at a time. The processing time of each operation is known in advance. The objective of JSSP is to determine the sequence of the operations on each machine (the start and finish time of each operation) that minimizes certain objective functions subject to the constraints mentioned above. The most common objective is the makespan, which is the total time in which all the operations of the problem are processed. Nevertheless, real scheduling problems are multi-objective, and several objectives should be considered simultaneously.
The three objectives addressed in the present paper are the following.
Makespan: the maximum completion time of all jobs.
Total tardiness: the sum of the positive differences between the completion time and the due date of each job.
Total flow time: the sum of the completion times of all jobs.
The formal MOJSSP model can be formulated as follows [34,35]:

min F(x) = [f_1(x), f_2(x), …, f_q(x)], x ∈ S (8)

where q is the number of objectives, x is the vector of decision variables, and S represents the feasible region, defined by the precedence and capacity constraints, respectively; t_i and t_j are the starting times of jobs i, j ∈ J, and p_i and p_j are their processing times.
The objective functions of makespan, total tardiness, and total flow time are defined by Equations (9)-(11), respectively.
where C_j is the completion time of job j.
where T_j = max(0, C_j − D_j) is the tardiness of job j, and D_j is the due date of job j, calculated as D_j = τ ∑_{i=1}^{m} p_{j,i} [36], where p_{j,i} is the time required to process job j on machine i. In this case, the due date of job j is the sum of the processing times of all its operations on all machines, multiplied by a narrowing factor (τ) in the range 1.5 ≤ τ ≤ 2.0 [14,36].
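As a small illustration of the due-date rule above (the job data here are made up, and τ must lie in the stated range):

```python
def due_dates(processing_times, tau=1.75):
    """Due date of each job: tau times the sum of its processing times
    over all machines (1.5 <= tau <= 2.0, per the text).
    `processing_times[j]` lists the times of job j on each machine."""
    return [tau * sum(p_row) for p_row in processing_times]

def total_tardiness(completion_times, dues):
    """Sum of positive lateness max(0, C_j - D_j) over all jobs."""
    return sum(max(0.0, c - d) for c, d in zip(completion_times, dues))
```

For two jobs with processing times (2, 3) and (4, 1) and τ = 2.0, both due dates are 10.0; completion times of 12 and 8 then yield a total tardiness of 2.0.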

Multi-Objective Proposed Algorithms
The two multi-objective algorithms presented in this section for solving JSSP are Chaotic Multi-Objective Simulated Annealing and Chaotic Multi-Objective Threshold Accepting. We describe these algorithms in this section after analyzing the single-objective optimization algorithms for JSSP.

Simulated Annealing
The algorithm SA proposed by Kirkpatrick et al. comes from a close analogy with the metal annealing process [5]. This process consists of heating and progressively cooling metal. As the temperature decreases, the molecules' movement slows down and tends to adopt a lower energy configuration. Kirkpatrick et al. proposed this algorithm for combinatorial optimization problems and to escape from local minima. It starts with an initial solution and generates a new solution in its neighborhood. If the new solution is better than the old solution, then it is accepted. Otherwise, SA applies the Boltzmann distribution, which determines if a bad solution can be taken as a strategy for escaping from local optima. This process is repeated many times until an equilibrium condition is accomplished.
The SA algorithm is shown in Algorithm 1. Line 1 receives the parameters: the initial (T_initial) and final (T_final) temperatures, the alpha value (α) for decreasing the temperature, and beta (β) for increasing the length of the Metropolis cycle. The current temperature (T_k) is set in line 2, and an initial solution (s_current) is generated randomly in line 3. The stop criterion is evaluated in line 4; this main cycle is repeated while the current temperature (T_k) is higher than the final temperature (T_final). The Metropolis cycle starts in line 5, and a neighboring solution (s_new) is generated in line 6. In line 7, the increment ∆E of the objective function between the current solution (s_current) and the new one (s_new) is determined. When this increment is negative (line 8), the new solution is better, and it replaces the current solution (line 9). Otherwise, the Boltzmann criterion is applied (lines 11 and 12); this criterion allows the algorithm to escape from local optima depending on the current temperature and delta values. Finally, line 16 increases the number of iterations of the Metropolis cycle, and in line 17 the cooling function is applied to reduce the current temperature.
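The structure of Algorithm 1 can be sketched as follows; the `neighbor` and `energy` callbacks are problem-specific placeholders of ours, not part of the original pseudocode:

```python
import math
import random

def simulated_annealing(initial, neighbor, energy,
                        t_initial, t_final, alpha, beta, l_k):
    """Sketch of Algorithm 1: geometric cooling with a Metropolis cycle
    whose length grows by beta at each temperature level."""
    current, t_k = initial, t_initial
    while t_k > t_final:
        for _ in range(int(l_k)):
            new = neighbor(current)
            delta_e = energy(new) - energy(current)
            # accept improvements; accept worse moves per Boltzmann
            if delta_e < 0 or random.random() <= math.exp(-delta_e / t_k):
                current = new
        l_k *= beta   # lengthen the Metropolis cycle
        t_k *= alpha  # geometric cooling
    return current
```

With a deterministic, always-improving neighbor this degenerates into a short descent, which makes the control flow easy to check by hand.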

Analytical Tuning for Simulated Annealing
The parameter tuning process for the SA algorithm used in this paper is based on a method proposed in [37]. This method establishes that both the initial and the final temperature are functions of the maximum and minimum energy increments, ∆E_max and ∆E_min, respectively. These increments appear in the Boltzmann distribution criterion, which states that a bad solution is accepted at a temperature T when random(0, 1) ≤ e^(−∆E/T). For JSSP, ∆E is obtained from the makespan. For this tuning method, the increments are obtained from the neighborhoods of different randomly generated solutions: a set of preliminary SA executions is carried out to obtain ∆E_max and ∆E_min, and these values are used in the Boltzmann distribution to determine the initial and final temperatures. Then, the other parameters of the Metropolis cycle are determined. The process is detailed in the next paragraphs.
Initial temperature (T_initial): The temperature value from which the search process begins. At high temperatures, the probability of accepting a new solution is almost 1, so the allowed cost deterioration is maximum; the initial temperature is therefore associated with the maximum allowed deterioration and its defined acceptance probability. Let us define s_i as the current solution, s_j as a new proposed solution, E(s_i) and E(s_j) as their associated costs, and ∆E_max and ∆E_min as the maximum and minimum deteriorations. Then P(∆E_max), the probability of accepting a solution with the maximum deterioration, is calculated with (12), and the value of the initial temperature (T_initial) is calculated with (13).
Final temperature (T_final): The temperature value at which the search stops. In the same way, the final temperature is determined with (14) according to the probability P(∆E_min) of accepting a solution with the minimum deterioration.
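From the Boltzmann criterion random(0, 1) ≤ e^(−∆E/T), the temperature bounds described in (12)-(14) follow directly; the display below is our reconstruction from the surrounding text, not copied from the source:

```latex
P(\Delta E_{\max}) = e^{-\Delta E_{\max}/T_{initial}}
\quad\Longrightarrow\quad
T_{initial} = \frac{-\Delta E_{\max}}{\ln P(\Delta E_{\max})},
\qquad
T_{final} = \frac{-\Delta E_{\min}}{\ln P(\Delta E_{\min})}
```

Setting P(∆E_max) close to 1 yields a large T_initial, while setting P(∆E_min) close to 0 yields a T_final near zero, as the text describes.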
Alpha value (α): The temperature decrease factor. This parameter determines the speed of the temperature decrease; values around 0.7 are usually used for fast decrements and around 0.99 for slow decrements.
Cooling scheme: This function specifies how the temperature is decreased. In this case, the value of the current temperature (T k ) follows the geometric scheme (15).
Length of the Markov chain, or iterations of the Metropolis cycle (L_k): This refers to the number of iterations of the Metropolis cycle performed at each temperature k; this number can be constant or variable. It is well known that at high temperatures only a few iterations are required, since stochastic equilibrium is rapidly reached [37]. However, at low temperatures a much more exhaustive level of exploration is required, so a larger L_k value must be used. If L_min is the value of L_k at the initial temperature and L_max is the value of L_k at the final temperature, then Formula (16) is used.
where β is the increment coefficient of L_k. Since Functions (15) and (16) are applied successively in SA from the initial to the final temperature, T_final and L_max are calculated with (17) and (18).
In (17) and (18) n is the number of steps from T initial to T f inal , then (19) and (20) are obtained.
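The relations (15)-(20) described above can be plausibly reconstructed as follows (our reading of the text, following the tuning method of [37]):

```latex
T_k = \alpha\, T_{k-1}, \qquad L_k = \beta\, L_{k-1} \\[4pt]
T_{final} = \alpha^{\,n}\, T_{initial}, \qquad L_{\max} = \beta^{\,n}\, L_{\min} \\[4pt]
n = \frac{\ln T_{final} - \ln T_{initial}}{\ln \alpha}, \qquad
\beta = \exp\!\left( \frac{\ln L_{\max} - \ln L_{\min}}{n} \right)
```

That is, once α, the temperature bounds, and the Metropolis-cycle bounds are fixed, the number of temperature steps n and the growth factor β are fully determined.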
The probability of selecting the solution s_j from N random samples in the neighborhood V_{s_i} is given by (21); from this equation, the N value is obtained in (22), where the exploration level C is defined in Equation (23).
The length of the Markov chain, or the number of iterations of the Metropolis cycle, is defined by (24).
To guarantee a good exploration level, the C value determined by (23) must be established between 1 ≤ C ≤ 4.6 [38].

Chaotic Multi-Objective Simulated Annealing (CMOSA)
As we previously mentioned, the AMOSA algorithm was proposed in [17]. However, this algorithm is designed for general purposes. In this work, we adapt the AMOSA for JSSP to include the following features: (1) the mathematical constraints of MOJSSP, and (2) the objective functions makespan, total tardiness, and total flow time.
CMOSA has the features previously described plus the following three elements: (1) a new structure, (2) a chaotic perturbation, and (3) the application of dominance to select solutions. These elements are described in the next subsections.

CMOSA Structure
The CMOSA algorithm uses a chaotic phase to improve the quality of the solutions considering the three objectives. Algorithm 2 receives its parameters in line 1: the initial temperature (T_initial), the final temperature (T_final), alpha (α), beta (β), the Metropolis iterations in every cycle (L_k), and the initial solution (s_current) to be improved. In lines 2 and 3, the variables of the algorithm are initialized. In line 4, s_current is processed to obtain the values of the three objectives as output. In line 5, the initial temperature is established as the current temperature (T_k). The main cycle begins in line 6; it is repeated as long as the current temperature is greater than or equal to the final temperature. In line 7, the Metropolis cycle begins. Subsequently, in line 8, the algorithm verifies whether it is stagnant; if so, lines 9 to 20 are executed. The number of iterations for a local search is established in line 10; this value is the number of tasks of the instance multiplied by an experimentally tuned parameter (in this case, timesLS = 10).

Chaotic Perturbation
The logistic equation or logistic map is a well-known mathematical application by the biologist Robert May for a simple demographic model [39]. It gives the population in the n-th generation based on the size of the previous generation, according to the popular logistic model:

x_{n+1} = r·x_n·(1 − x_n) (25)

In Equation (25), the variable x_n takes values between zero and one. It represents the fraction of individuals in a specific situation (for instance, within a territory or with a particular feature) at a given instant n. The parameter r is a positive number representing the combined ratio between reproduction and mortality. Although this paper is not concerned with demographic or similar problems, we notice that this variable changes very rapidly, so it can be taken as a chaotic variable. Thus, we use it to build a chaotic perturbation function, which may help our CMOSA and CMOTA algorithms escape from local optima.
The chaotic function used is very sensitive to changes in the initial conditions, and this characteristic is used to generate a perturbation to the solution for escaping from local optima. Then chaos or chaotic perturbation is a process carried out to restart the search from another point in the space of solutions.
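A sketch of this perturbation source: iterating the logistic map with r = 4 produces a sequence in [0, 1] that is extremely sensitive to the initial value x_0 (thresholding such a sequence, as in Algorithm 4, drives the operation selection):

```python
def logistic_map(x, r=4.0):
    """One step of the logistic map x_{n+1} = r * x_n * (1 - x_n)."""
    return r * x * (1.0 - x)

def chaotic_sequence(x0, n, r=4.0):
    """Iterate the map n times; for r = 4 the orbit is chaotic on (0, 1)."""
    seq, x = [], x0
    for _ in range(n):
        x = logistic_map(x, r)
        seq.append(x)
    return seq
```

Two starting points that differ by a tiny amount quickly produce completely different orbits, which is precisely the property exploited to restart the search elsewhere in the solution space.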
Algorithm 4 can be explained in three steps. Firstly, the feasible operations (operations that can be performed without violating any restrictions) are identified (line 4). Secondly, if there is only one feasible operation (line 5), it is the last one and is selected (line 6); when there is more than one feasible operation, a chaotic function is applied to select among them. In this case, the logistic function is used (lines 8-19), which applies a threshold in the range [0.5, 1]. Finally, the selected operation is added to the new solution (line 21). This process is applied until all the operations have been selected.

Chaotic Multi-Objective Threshold Accepting (CMOTA)
In 1990, Dueck et al. proposed the TA algorithm as a general-purpose algorithm for solving combinatorial optimization problems [6]. TA has a simpler structure than SA and is very efficient for many problems, but it has never been applied to MOJSSP. The difference between SA and TA lies basically in the criterion for accepting bad solutions: TA accepts every new configuration that is not much worse than the old one, while SA accepts worse solutions only with small probabilities. An apparent advantage of TA is that it is simpler, because it is not necessary to compute probabilities or make decisions based on a Boltzmann probability distribution.
Algorithm 6 shows the CMOTA algorithm, which has the same structure as the CMOSA algorithm. These two algorithms have a temperature cycle and, within it, a Metropolis cycle. In both algorithms, a perturbation is applied to the current solution; then, the dominance of the two solutions is verified to determine which of them is used to continue the search (Algorithm 7). Finally, the increment of the variable that controls the iterations of the Metropolis cycle, the reduction of the temperature, and the increment of the counter for the number of temperatures (line 39) are performed.
In Algorithm 7, the dominance of the two solutions is verified to determine which continues with the search. It has the same three cases used in CMOSA (Algorithm 5). The main differences are the following:

• In the beginning, while the temperature counter (counter) is less than the value of bound (line 4), T takes the value T_k (line 5), which is very large; this implies that at high temperatures the new solution (s_new) will often be accepted to continue the search. That is, during the processing of the first 95% of the temperatures (parameter limit = 0.95, whose value is obtained with Equation (19) in the tuning process), the parameter γ is used to obtain the threshold value T; since γ is equal to 1, T keeps the value T_k. For the remaining five percent of the temperatures, γ takes the value γ_reduced (0.978). This parameter is tuned experimentally (line 12) and is established to make the acceptance criterion more restrictive toward the end of the process.
• CMOTA includes a verification process for accepting bad solutions that is slightly different from that of CMOSA. To determine whether the search continues from a dominated solution, CMOTA does not use the Boltzmann criterion; instead, it uses a threshold defined as the value of the parameter T (line 19), which is updated in line 29. In other words, it is no longer necessary to calculate the decrement of the objective functions. This modification makes CMOTA much more straightforward than CMOSA or any AMOSA-style algorithm. Moreover, because the parameter γ is usually very close to one, it is unnecessary to compute Boltzmann probabilities or to run a random decision process for every bad solution.

6. Main Methodology for CMOSA and CMOTA

Figure 1 shows the main module for each of the two proposed algorithms, CMOSA and CMOTA, which may be considered the main process in any high-level language.
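Putting the two points above together, one acceptance decision of CMOTA might look like the following Python sketch. The Pareto-dominance test is standard; the handling of the mutually non-dominated case and the function names are our assumptions about Algorithm 7, not the authors' code.

```python
import random

def dominates(a, b):
    """True if objective vector a Pareto-dominates b (all minimized):
    a is no worse in every objective and strictly better in at least one."""
    return all(x <= y for x, y in zip(a, b)) and any(x < y for x, y in zip(a, b))

def cmota_accept(current, new, T, rng=random):
    """One acceptance decision. A dominating new solution always replaces
    the current one; a dominated new solution is accepted only when a
    uniform draw falls below the threshold T (line 19 of Algorithm 7).
    Mutually non-dominated solutions continue with the new one (assumed)."""
    if dominates(new, current):
        return new
    if dominates(current, new):
        return new if rng.random() < T else current
    return new
```

Each objective vector here is a tuple (makespan, total tardiness, total flow time). After every Metropolis cycle, the threshold is lowered as T ← T × γ (line 29), so the criterion becomes increasingly restrictive.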
In this main module, the instance to be solved is read and the tuning process is performed. The due date, an essential element for calculating the tardiness, is then computed. The set of initial solutions (S) is generated randomly as follows: first, the collection of feasible operations is determined; then one of them is randomly selected and added to the solution, and this is repeated until all the job operations have been added.
Once the set of initial solutions has been generated, an algorithm (CMOSA or CMOTA) is applied to improve each initial solution, and each improved solution is stored in a set of final solutions (F). To obtain the set of non-dominated solutions, also called the zero front (f_0), from the set of final solutions, we applied the fast non-dominated sorting algorithm [29].
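Extracting the zero front f_0 from F reduces to keeping every solution that no other solution dominates. The sketch below shows only this quadratic filtering step at the heart of fast non-dominated sorting [29] (the full algorithm also ranks the dominated solutions into subsequent fronts); it is illustrative Python rather than the authors' C code.

```python
def dominates(a, b):
    """Pareto dominance for minimization."""
    return all(x <= y for x, y in zip(a, b)) and any(x < y for x, y in zip(a, b))

def zero_front(final_solutions):
    """Return the non-dominated set f0 of a list of objective vectors."""
    return [p for p in final_solutions
            if not any(dominates(q, p) for q in final_solutions)]

# (9, 9, 40) is dominated by (8, 7, 28) and is therefore filtered out
F = [(10, 5, 30), (8, 7, 28), (9, 9, 40)]
front = zero_front(F)
```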
To assess the quality of the non-dominated set obtained, the MID, Spacing, HV, Spread, IGD, and Coverage metrics are calculated. Computing Spread and IGD requires the true Pareto front (PF_true); however, PF_true has not been published for all the instances used in this paper. For this reason, the calculation was made using an approximate Pareto front (PF_approx), which we obtained from the union of the fronts calculated in previous executions of the two algorithms presented here (CMOSA and CMOTA).
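Of these metrics, MID is the simplest to state: the mean Euclidean distance from the front's points to a reference point, here the origin (0, 0, 0), as used later in the MID comparison. The Python sketch below uses this common definition, which we assume matches the paper's (some authors normalize the objectives first):

```python
import math

def mid(front, ref=(0.0, 0.0, 0.0)):
    """Mean Ideal Distance: average distance from front points to ref.
    Lower values mean the front lies closer to the reference point."""
    return sum(math.dist(p, ref) for p in front) / len(front)

# a single point at (3, 4, 0) lies at distance 5 from the origin
assert mid([(3.0, 4.0, 0.0)]) == 5.0
```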
As already explained, the analytical tuning requires some previous executions of the algorithm. The parameters used for those previous executions are shown in Table 2, and the parameters used in the final experimentation for each instance are shown in Table 3. The experiments were carried out on one of the terminals of the Ehecatl cluster at TecNM/IT Ciudad Madero, with the following characteristics: Intel Xeon processor at 2.30 GHz, 64 GB of memory (4 × 16 GB, DDR4-2133), and the CentOS Linux operating system; the implementation was written in C. The source code of CMOSA (https://github.com/DrJuanFraustoSolis/CMOSA-JSSP.git) and CMOTA (https://github.com/DrJuanFraustoSolis/CMOTA-JSSP.git) is available, and we tested the software using three data sets reported in the paper and taken from the literature.
In the first experiment, CMOSA and CMOTA were compared with the AMOSA algorithm using the 70 described instances and six performance metrics. In the second experiment, we compared CMOSA and CMOTA with the IMOEA/D algorithm on the 58 instances used by Zhao [14], using the same MID metric as that publication. The third experiment was based on the 15 instances reported in [8], where results for the following MOJSSP algorithms are published: SPEA, CMOEA, MOPSO, and MOMARLA. In that publication, the authors used two objective functions and two metrics (HV and Coverage); they determined that the best algorithm is MOMARLA, followed by MOPSO. We executed CMOSA and CMOTA on the instances of this data set and compared our results, using the HV metric, with those published in [8]. However, a comparison using the Coverage metric was impossible because the Pareto fronts of these methods have not been reported [8]. In our case, we show in Appendix A the fronts of non-dominated solutions obtained for the 70 instances.

Results
The average values over 30 runs of the six metrics obtained by CMOSA and CMOTA for the complete data set of 70 instances are shown in Tables 4 and 5. We observed that CMOSA obtained the best values for the MID and IGD metrics, while CMOTA obtained the best results for Spacing and Spread. For the HV metric, both algorithms achieved the same result (0.42). Table 5 shows that CMOSA obtained the best Coverage result.
A two-tailed Wilcoxon test with a significance level of 5% (last column in Table 4) shows that there are no significant differences between CMOSA and CMOTA except in the MID and IGD metrics.

Table 6 shows the comparison of CMOSA and AMOSA. We observed that CMOSA obtains the best performance in all the metrics evaluated. In addition, the Wilcoxon test indicates that there are significant differences in most of them; thus, CMOSA outperforms AMOSA. We compared CMOTA and AMOSA in Table 7. In this case, CMOTA also obtains the best average results in all the metrics; however, according to the Wilcoxon test, there are significant differences in only two of them.

In Table 8 we compare CMOSA and CMOTA with the IMOEA/D algorithm using the 58 common instances published in [14], where the MID metric was measured. This table shows the average MID value for the non-dominated set of solutions of CMOSA and CMOTA. Both algorithms achieved smaller MID values than IMOEA/D, which indicates that their Pareto fronts are closer to the reference point (0, 0, 0). The Wilcoxon test confirms that CMOSA and CMOTA surpassed IMOEA/D.

Finally, the results of CMOSA and CMOTA were compared with the SPEA, CMOEA, MOPSO, and MOMARLA algorithms [8]. In that reference, only two objective functions were reported: the makespan and the total tardiness. The experimentation was carried out with 15 instances, and the average HV values were calculated to analyze the results, which are shown in Table 9. MOMARLA surpassed SPEA, CMOEA, and MOPSO, while CMOSA obtained a better performance than MOMARLA and the other algorithms. Comparing CMOTA and MOMARLA, we notice that both algorithms obtained the same average HV. Table 9. Comparison among SPEA, CMOEA, MOPSO, CMOSA, CMOTA, and MOMARLA using HV.
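The statistical comparisons above rely on the two-tailed Wilcoxon signed-rank test at α = 0.05. As an illustration of the statistic being computed (in practice one would use a vetted routine such as scipy.stats.wilcoxon rather than this sketch), a minimal Python version with zero differences dropped and average ranks for ties:

```python
def wilcoxon_w(a, b):
    """Wilcoxon signed-rank statistic W = min(W+, W-) for paired samples.
    Zero differences are dropped; tied |differences| receive the average
    of the ranks they span. Sketch only; not a full significance test."""
    diffs = [x - y for x, y in zip(a, b) if x != y]
    ordered = sorted(diffs, key=abs)
    rank_of = {}                      # |difference| -> average rank
    i = 0
    while i < len(ordered):
        j = i
        while j < len(ordered) and abs(ordered[j]) == abs(ordered[i]):
            j += 1
        avg_rank = (i + 1 + j) / 2.0  # average of 1-based ranks i+1 .. j
        for k in range(i, j):
            rank_of[abs(ordered[k])] = avg_rank
        i = j
    w_plus = sum(rank_of[abs(d)] for d in diffs if d > 0)
    w_minus = sum(rank_of[abs(d)] for d in diffs if d < 0)
    return min(w_plus, w_minus)
```

The resulting W is then compared against the critical value for the sample size (or converted to a p-value) to decide significance at the 5% level.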

CMOSA-CMOTA Complexity and Run Time Results
In this section, we present the complexity of the algorithms analyzed in this paper. The algorithms' complexity is presented in Table 10; it was obtained directly when explicitly published, or otherwise determined from the algorithms' pseudocodes. In this table, M is the number of objectives, Γ is the population size, T is the neighborhood size, n is the number of iterations (temperatures for AMOSA, CMOSA, and CMOTA), and p is the problem size. The latter is equal to jm, where j and m are the number of jobs and machines, respectively. Because the algorithms with the best quality metrics are CMOSA, CMOTA, MOMARLA, and MOPSO, their complexities are compared in this section.
It is well known that the complexity of classical SA is O(p^2 log p) [45]. However, we notice from Table 10 that CMOSA and CMOTA have a different complexity even though they are based on SA. This is because these new algorithms apply a different chaotic perturbation and another local search (see Algorithms 2 and 6, lines 10-20). The temporal functions of MOMARLA, CMOSA, and CMOTA belong to O(Mnp). For MOMARLA, n is the number of iterations, a variable set at the beginning of the algorithm; for CMOSA and CMOTA, n is the number of temperatures, also set at the beginning. In any case, the difference is only a constant factor.
We note that AMOSA and MOPSO have similar complexity class expressions, O(nΓ^2) and O(MΓ^2) respectively. However, MOPSO outperforms AMOSA in this respect because M is in general lower than n. We observe that CMOSA, CMOTA, and MOMARLA belong to the O(Mnp) complexity class, while MOPSO belongs to O(MΓ^2) [46]. The ratio between them is therefore np/Γ^2, which in general is lower than one; thus CMOSA, CMOTA, and MOMARLA have a lower complexity than MOPSO. Moreover, CMOSA, CMOTA, and MOMARLA have better HV quality, as shown in Table 9.
In the next paragraphs, we present a comparative analysis of the execution time of the algorithms implemented in this paper. Table 11 shows the execution time, expressed in seconds, of the three algorithms implemented here (CMOSA, CMOTA, and AMOSA) for the three data sets (70, 58, and 15 instances). We emphasize that the AMOSA algorithm was the basis for the design of the other two; all of them have the same structure, except that CMOSA and CMOTA apply chaotic perturbations when they detect a possible stagnation. Thus, all of them have similar worst-case complexity measures. Table 11 also shows the percentage of time saved by the two proposed algorithms relative to AMOSA. For these data sets, CMOSA saved 2.1, 19.87, and 42.48 percent of the AMOSA run time, while the corresponding figures for CMOTA are 55, 68.89, and 46.73 percent. Thus, both proposed algorithms are significantly more efficient than AMOSA. Unfortunately, we could not compare the execution time of these algorithms against the other algorithms in Table 1. Nevertheless, we made the quality comparisons using the metrics previously published.
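The time-saved percentages in Table 11 follow the usual relative formula, 100 × (t_AMOSA − t_variant) / t_AMOSA. A one-line Python sketch (the timings below are made-up illustration values, not the paper's measurements):

```python
def percent_saved(t_base, t_variant):
    """Percentage of the baseline run time avoided by a faster variant."""
    return 100.0 * (t_base - t_variant) / t_base

# illustration only: a variant finishing in 45 s against a 100 s baseline
assert percent_saved(100.0, 45.0) == 55.0
```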

Conclusions
This paper presents two multi-objective algorithms for JSSP, named CMOSA and CMOTA, evaluated with three objectives and six metrics. The objective functions are makespan, total tardiness, and total flow time. Regarding the comparison of CMOSA and CMOTA with AMOSA, both algorithms obtained a well-distributed Pareto front, close to the origin and close to the approximate Pareto front, as indicated by the Spacing, MID, and IGD metrics, respectively. Thus, according to these metrics, CMOSA and CMOTA surpassed the AMOSA algorithm. Regarding the volume covered by the front, measured by the HV metric, both algorithms have the same performance; however, CMOSA has a better convergence than CMOTA. In addition, the proposed algorithms surpass IMOEA/D when the MID metric is used. Moreover, we used HV to compare the proposed algorithms with SPEA, CMOEA, MOPSO, and MOMARLA, and found that CMOSA outperforms these algorithms, followed by CMOTA, MOMARLA, and MOPSO.
We observe that CMOSA and CMOTA have a complexity similar to that of the best algorithms in the literature. In addition, we show that CMOSA and CMOTA surpass AMOSA when compared by execution time on three data sets; CMOTA is, on average, 50 percent faster than AMOSA and CMOSA. Finally, we conclude that CMOSA and CMOTA have a temporal complexity similar to the best algorithms in the literature, while the quality metrics show that the proposed algorithms outperform them.

Acknowledgments:
The authors would like to express their gratitude to CONACYT and TecNM/IT Ciudad Madero. In addition, the authors acknowledge the support from Laboratorio Nacional de Tecnologías de la Información (LaNTI) for the access to the cluster.

Conflicts of Interest:
The authors declare no conflict of interest.

Appendix A. Non-Dominated Front Obtained
The non-dominated solutions obtained by the CMOSA algorithm for the 70 instances used are shown in Tables A1-A6, and the non-dominated solutions obtained by the CMOTA algorithm for the same instances are shown in Tables A7-A12. In these tables, MKS is the makespan, TDS is the total tardiness, and FLT is the total flow time. For each instance, the best value for each objective function is highlighted with an asterisk (*) and in bold type.
Table A4. Non-dominated front obtained by CMOSA for the JSSP instances proposed by [43].
Table A5. Non-dominated front obtained by CMOSA for the JSSP instances proposed by [44].
Table A7. Non-dominated front obtained by CMOTA for the JSSP instances proposed by [40].
Table A10. Non-dominated front obtained by CMOTA for the JSSP instances proposed by [43].
Table A11. Non-dominated front obtained by CMOTA for the JSSP instances proposed by [44].
(The table bodies, listing MKS, TDS, and FLT values for the LA, ABZ, and TA instances, are omitted here.)