A Modified Genetic Algorithm with Local Search Strategies and Multi-Crossover Operator for Job Shop Scheduling Problem

It is not uncommon for today's problems to fall within the scope of the well-known class of NP-Hard problems. These problems generally do not have an analytical solution, and it is necessary to use meta-heuristics to solve them. The Job Shop Scheduling Problem (JSSP) is one of these problems, and techniques based on the Genetic Algorithm (GA) form the most common approach used in the literature for its solution. However, GAs are easily compromised by premature convergence and can be trapped in local optima. To address these issues, researchers have been developing new methodologies based on local search schemes and improvements to the standard mutation and crossover operators. In this work, we propose a new GA within this line of research. In detail, we generalize the concept of a massive local search operator, improve the use of a local search strategy in the traditional mutation operator, and develop a new multi-crossover operator. In this way, all operators of the proposed algorithm have local search functionality beyond their original inspirations and characteristics. Our method is evaluated in three case studies comprising 58 instances from the literature, which demonstrate the effectiveness of our approach compared to traditional JSSP solution methods.


Introduction
Scheduling problems have been extensively researched in recent years because they are high-complexity combinatorial optimization problems classified as NP-Hard. Among machine scheduling problems there are several variations, such as the Job Shop Scheduling Problem (JSSP), the Flexible Job Shop Scheduling Problem (FJSSP), and Flow Shop Scheduling Problems (FSP). In this paper, we consider the JSSP variant. The JSSP is a combinatorial optimization problem composed of a set of Jobs to be processed on a set of Machines. To solve the problem, it is necessary to find a sequence of Jobs for each Machine that optimizes a specific performance criterion, for example, the makespan, which corresponds to the total processing time of all Jobs [1,2].
In recent years, several meta-heuristic approaches have been proposed to treat the JSSP, such as the Greedy Randomized Adaptive Search Procedure (GRASP) [3] and the Local Search Genetic Algorithm (LSGA) [4], among others.
The brief list above shows that some meta-heuristics are more widespread in the JSSP literature than others and therefore have larger numbers of publications in this field. For example, GA has been used in a much larger number of published works on the JSSP than the Single Seekers Society method, which, to the best of our knowledge, was applied in only one work in this field. This is because, among the variety of meta-heuristics that address the JSSP with good results, GA achieved great prominence by offering good performance due to its global search capacity. In addition, algorithms that use hybrid meta-heuristics stand out among the various methods applied to the JSSP, since they can combine the merits of different meta-heuristic algorithms. Hybrid algorithms generally perform remarkably well, incorporating local search to obtain an appropriate combination of diversification and intensification efforts. Among the great diversity of studies, it is possible to verify the superiority of meta-heuristics with global search behavior when combined with techniques specialized in performing a local search. This type of procedure is addressed in several works, recent or not, such as the approaches in References [4,6,9,11-13,22], among others.
In the following paragraphs, some of the literature dealing with the JSSP through several meta-heuristic approaches is detailed, namely References [4,6,22]. We chose these techniques because they represent significant advances in solving the JSSP using local search strategies in GAs.
Ombuki and Ventresca [4] proposed the LSGA meta-heuristic to treat the JSSP, minimizing the makespan, that is, the maximum completion time. The proposed LSGA is a GA with a local search, which has an operator similar to the mutation that is aimed at local search to further improve the quality of the solution. In LSGA, local search is applied probabilistically; that is, either the simple mutation or the local search mutation is selected dynamically in each generation of the GA. In a simple mutation, exactly one exchange is allowed: the positions of two consecutive jobs selected at random are exchanged. In a local search mutation, a systematic approach is used to consecutively test multiple swaps along the chromosome length; the best improvement obtained by a swap is saved, but if no improvement is found, no swap is saved to the original solution and the process is finished. LSGA found solutions whose makespan results were equal to the best-known values or, when worse, were within a maximum error range of 10%. LSGA obtained better search behavior and achieved better makespan results than a canonical GA, so it is possible to say that the local search strategy included in the GA improved its search technique and, with that, the solutions obtained.
In the article by Watanabe, Ida and Gen [22], the meta-heuristic GA with Search Area Adaptation (GSA) was proposed to solve the JSSP. The proposed GSA adapts the search area, with the ability to adjust to the structure of the solution space and to control the balance between global and local searches. The crossover operation of the GSA consists of performing the crossover several times on all pairs of parents, with a new cutoff point determined each time. The crossover is repeated until a child is found that is better than the worst individual in the population or until a certain number of iterations is reached. The mutation operation consists of executing disturbances several times on all children, performing several swaps. The mutation is repeated until a mutant child is found that is better than the worst individual in the population or until a certain number of iterations is reached. The GSA was compared with a GA and showed better results, achieving a greater frequency of finding solutions close to optimal. Although the GSA did not find the best-known solution in the tested instances, it showed an improvement over the canonical GA.
The meta-heuristic aLSGA, proposed by Asadzadeh [6], was created specifically to solve the JSSP. The proposed aLSGA is a GA with local search that includes intelligent agents. The method consists of a multi-agent system in which each agent has a specialized behavior to implement the local search. The aLSGA combines local search heuristics with crossover and mutation operators. The proposed method has two local search procedures: the first is called the Local Search Agent (LSA) and the second the Elite Local Search Agent (ELSA). The first is responsible for exploring the neighborhood of each chromosome produced by the crossover operator, and the second for improving the quality of the best chromosome in the population. The computational results showed that the proposed aLSGA is effective in finding optimal and near-optimal solutions for several instances of the JSSP. It is noted that including a local search heuristic in a GA improves its performance and its convergence speed.
Several approaches applied to the JSSP have been proposed; some include intelligent agents, parallel populations, or hybridization of meta-heuristics with local search techniques. The reported works show that hybridization is an effective way to improve the performance of several meta-heuristics and presents relevant results in the literature. Local search techniques are the most common form of hybridization and are used to improve the performance of these algorithms. It is precisely in this scope that we intend to act, with our methodology presented as an extension and improvement of References [4,6,22] to define a GA portfolio with specialized local search for the JSSP.

Formulation of Job Shop Scheduling Problem
The JSSP is a combinatorial optimization problem belonging to the NP-Hard class of computational problems, meaning that no polynomial-time algorithm is known for solving it. Specifically, a JSSP can be defined as a set of N jobs to be processed on a set of M machines. In the JSSP, each job must be processed by all M machines, and each job has a sequence of operations with M distinct elements belonging to the set of all possible operations. The sequences of operations are usually different for each job. The scheduling problem of the JSSP type consists of finding a job sequence for each machine that optimizes a specific performance criterion, usually the makespan. Some restrictions must be followed in this problem [2]:
• Each job can be processed on a single machine at a time;
• Each machine can process only one job at a time;
• Operations are considered non-preemptive, that is, they cannot be interrupted;
• Configuration (setup) times are included in processing times and are independent of sequencing decisions.
In this work, makespan is adopted as a measure of performance, which is considered to be the total time required to complete the production of a series of jobs. The makespan performance measurement formulation is generally used in JSSP approaches as an objective function that guides algorithms using meta-heuristics to search for appropriate solutions.
Mathematically, consider the specifications of the JSSP given above, and let C_j(O) denote the completion time of job j under the schedule induced by an operation sequence O. Then, following Reference [62], the makespan measure of an operation sequence O can be defined as the value presented in Equation (1):

makespan(O) = max_{j ∈ {1, 2, ..., N}} C_j(O),  (1)
which is a measure given according to the order of operations defined in O, since the time that each job takes to be considered finished is given according to the processing order defined in the schedule.

Multi-Crossover Local Search Genetic Algorithm for JSSP
In this section, we discuss concepts fundamental to the execution of the proposed algorithm and specify the improved methods defined in this work. In short, our contributions comprise the following topics:
• An improved crossover operator (Section 4.4) based on the version of Reference [22], including a multi-crossover strategy with the goal of increasing the search capability of the method by using a framework based on a set of crossover functions.
• An improved local search technique (Section 4.5) in union with a generalized version of the mutation operator proposed in References [4,6], including a variable parameterization.
• An improved version of the elite local search operator (Section 4.6) of Reference [6], expanding the search space by utilizing a set of mutation functions.
In this way, our contribution starts from Section 4.4; the previous Sections 4.1-4.3 are of a descriptive and informative nature, presenting basic concepts of the operation of our method and of most GA-like methods specialized in solving the JSSP.

Genetic Representation
Except for the presence of specific operators of each work, the basic structure of a GA continues to be formed by the repetition loop that involves two main operators: crossover operator and mutation operator. This structure is preserved in the vast majority of state-of-the-art techniques. The codification used to represent a possible solution (chromosome) of the problem can be done in many different ways, as highlighted in Reference [63].
In this paper, a codification equivalent to one of the most common representations in the literature is used, known as "coding by operation order", first presented in Reference [64]. In this representation, the solution space of a JSSP with N jobs and M machines is formed by chromosomes c ∈ N^{N·M} such that exactly M coordinates of c are equal to i (representing the job index i), for every i ∈ {1, 2, ..., N}. Figure 1 shows some examples of chromosomes (c_1, c_2 and c_3) that obey this formulation in a JSSP with 2 jobs (N = 2) and 3 machines (M = 3). As the formulation requires, job indices 1 and 2 each appear exactly 3 times, since 3 is the number of machines in the problem. This codification determines the priority of each operation in machine allocation. As an example, let c = (1, 2, 1, 1, 2, 2) be a chromosome in a JSSP of dimension 2 × 3. In this case, the order established by c defines that the following actions must be performed sequentially, where each action can only be initiated if it can be performed in parallel with the previous action or if the previous action has already been completed:
• (1st) Job 1 must be processed by the 1st machine of its script.
• (2nd) Job 2 must be processed by the 1st machine of its script.
• (3rd) Job 1 must be processed by the 2nd machine of its script.
• (4th) Job 1 must be processed by the 3rd machine of its script.
• (5th) Job 2 must be processed by the 2nd machine of its script.
• (6th) Job 2 must be processed by the 3rd machine of its script.
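To make this priority-based decoding concrete, the following Python sketch (our own illustration, not code from the paper; the instance data and function names are hypothetical) simulates the rule above and returns the makespan of the schedule it induces:

```python
def decode_makespan(chrom, routes, ptimes):
    """Decode an operation-order chromosome into a schedule and return its makespan.

    chrom  : sequence of job indices; the k-th occurrence of job j triggers
             the k-th operation in that job's script
    routes : routes[j][k] = machine required by the k-th operation of job j
    ptimes : ptimes[j][k] = processing time of that operation
    """
    next_op = {j: 0 for j in routes}    # next operation index for each job
    job_free = {j: 0 for j in routes}   # time at which each job becomes available
    mach_free = {}                      # time at which each machine becomes available
    for j in chrom:
        k = next_op[j]
        m = routes[j][k]
        # an operation starts as soon as both its job and its machine are free
        start = max(job_free[j], mach_free.get(m, 0))
        finish = start + ptimes[j][k]
        job_free[j] = finish
        mach_free[m] = finish
        next_op[j] = k + 1
    return max(job_free.values())       # makespan = latest job completion time

# Hypothetical 2 x 3 instance: machine order (script) and processing times per job.
routes = {1: [0, 1, 2], 2: [1, 0, 2]}
ptimes = {1: [3, 2, 2], 2: [2, 3, 4]}
c = [1, 2, 1, 1, 2, 2]                  # the chromosome from the example above
print(decode_makespan(c, routes, ptimes))  # prints 11
```

Decoding is linear in the chromosome length N·M, which keeps fitness evaluation cheap relative to the search itself.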
Thus, one way to generate the initial population in this type of configuration is to create a group of chromosomes equal to (1, ..., 1, 2, ..., 2, ..., N, ..., N), in which each of the N jobs of a JSSP appears in exactly M positions, and then randomly rearrange all coordinates of each chromosome. Mathematically, we consider the function f_shuffle that randomly reorders the coordinates of a given vector c, defined in Equation (2):

f_shuffle(c) = (c_{i_1}, c_{i_2}, ..., c_{i_{N·M}}),  (2)

where i_1, i_2, ..., i_{N·M} are a random arrangement of the indices 1, 2, ..., N · M.
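A minimal sketch of this initialization, with f_shuffle implemented as a random permutation of the coordinates (function names are ours):

```python
import random

def f_shuffle(c, rng=random):
    """Randomly reorder the coordinates of a chromosome (Equation (2))."""
    c = list(c)
    rng.shuffle(c)
    return c

def initial_population(n_jobs, n_machines, pop_size, rng=random):
    """Build pop_size shuffled copies of (1,...,1, 2,...,2, ..., N,...,N),
    so every job index appears exactly n_machines times in each chromosome."""
    base = [j for j in range(1, n_jobs + 1) for _ in range(n_machines)]
    return [f_shuffle(base, rng) for _ in range(pop_size)]

pop = initial_population(n_jobs=2, n_machines=3, pop_size=5)
# every chromosome generated this way is feasible by construction
```

Because shuffling only permutes the multiset of job indices, feasibility of each chromosome is guaranteed without any repair step.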

Fitness Function
The objective function, or fitness function, of the optimization problem discussed here can be modeled according to the function F, defined in Equation (3):

F(O) = makespan(O), for O ∈ O,  (3)

where O is the set of all possible sequences for the defined JSSP. That is, if O ∈ O, then O is a possible sequence of operations that defines the start processing priority for N jobs on M machines. In other words, O is the feasible set of solutions in which our method must perform its search. The lower the makespan value of a schedule, the less time is needed to finish a given set of jobs. Thus, the algorithm should look for configurations of a schedule that minimize the time spent to complete the processing of jobs on the set of machines that configure the JSSP.

Selection Mechanism
Selection strategies are used to choose individuals to reproduce and to create new populations in evolutionary algorithms. In this paper, individuals are selected to participate in the crossover process with probability equivalent to their fitness value, which is known as roulette wheel selection [65]. The individuals selected for crossover define a set entitled P_selected. The selection approach for generating a new population used in the proposed algorithm is roulette wheel selection with the retaining model, in which the probability of an individual being selected is proportional to its fitness value and the best individual in the current population is always transferred to the new population. It is important to mention that there are different mechanisms for selecting individuals in the specialized literature; however, we focus our analysis on the models mentioned, since they are widely used in studies that address the JSSP and present good results in this field, as is the case of the advances brought by Ombuki and Ventresca [4], Asadzadeh and Zamanifar [5] and Asadzadeh [6], which use only roulette wheel selection strategies.
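A minimal sketch of roulette wheel selection with the retaining model. Since the JSSP minimizes makespan, some transform is needed to turn makespans into selection weights; using the inverse of the makespan is one common choice and an assumption of ours, not something the text specifies:

```python
import random

def roulette_wheel_select(population, makespans, k, rng=random):
    """Roulette wheel selection for a minimization problem.

    Assumption: the selection weight of an individual is the inverse of its
    makespan, so lower-makespan individuals get larger wheel slices.
    """
    weights = [1.0 / m for m in makespans]   # makespans assumed positive
    return rng.choices(population, weights=weights, k=k)

def new_population(population, makespans, rng=random):
    """Retaining model: the best individual always survives into the new population."""
    best = population[min(range(len(population)), key=lambda i: makespans[i])]
    rest = roulette_wheel_select(population, makespans, len(population) - 1, rng)
    return [best] + rest
```

The retaining step is what guarantees the best-so-far solution is never lost between generations.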

Proposed Multi-Crossover Operator
To detail the operation of our multi-crossover operator, we need to address the use of different crossover functions. In Section 4.4.1, we model the possible functions for this operator and exemplify the operation of two of the most used functions to solve the JSSP. In Section 4.4.2, we present our multi-crossover strategy that makes up our method and consists of one of the contributions of this work.

Crossover Functions
The proposed crossover operator consists of using more than one crossover function within search area adaptation [22] strategies. Thus, our proposal takes the form of a framework that considers a set of n_× crossover functions to combine chromosomes. We define this set to be F_×, presented in Equation (4):

F_× = {f_{×,1}, f_{×,2}, ..., f_{×,n_×}}.  (4)

Let us consider for our modeling, without loss of generality, that each function f_× ∈ F_× combines two parent chromosomes from the feasible set, resulting in two children chromosomes; that is, each function of F_× takes the form f_×(c_1, c_2) = (ĉ_1, ĉ_2). In this work, we focus on two main crossover functions for our assessments and evaluations: Partially Mapped Crossover (PMX) [66] and Order-based Crossover (OX2) [67]. These are two of the most widely used crossover functions in the specialized literature on solving the JSSP with genetic algorithms. Thus, F_× = {PMX, OX2} in our experiments; however, the same conclusions about our method can be obtained with any set F_×.
OX2 (Figure 2-left) does not require any correction or projection steps as its feasibility is guaranteed by construction. This is because the technique only matches the order in which jobs appear in parents. In detail, initially, a random number of genes are fixed. An offspring inherits in its coordinates the exact position these genes assume in one parent and completes the remaining positions with the other parent's unfixed genes.
PMX (Figure 2-right) combines two chromosomes based on two randomly defined cutoff points. To generate the child chromosomes, the strategy is to mix the genes internal to the cutoffs in one parent with the genes external to the cutoffs in the other parent. This procedure can generate infeasible solutions, which are usually corrected [6] in JSSP applications by projecting the generated individuals onto the feasible space with respect to the Hamming distance [68]. To exemplify the working of these crossover functions, let us assume that we apply both OX2 and PMX to the same pair of individuals, Parent 1 = (1, 2, 3, 4, 4, 3, 2, 1) and Parent 2 = (4, 4, 3, 3, 2, 2, 1, 1). In detail:

• In the case of the crossover function OX2:
1. Initially, it is necessary to decide the values of the genes (jobs) that should maintain their positions in the coordinates of the offspring. Let us assume that jobs 2 and 3 were randomly set.
2. The genes that represent the index jobs 2 and 3 must occupy the same positions in the coordinates of the offspring generated during the transfer of the chosen genes.
3. Then, it is necessary to transfer the other jobs (1 and 4) to the coordinates of the offspring that did not receive any job. Now Parent 1 passes its genetic information to Child 2 and Parent 2 passes its genetic information to Child 1. Respecting the order in which the coordinates representing jobs 1 and 4 are arranged in the parents, these genes are transferred to the offspring. As a result, we get Child 1 = (4, 2, 3, 4, 1, 3, 2, 1) and Child 2 = (1, 4, 3, 3, 2, 2, 4, 1).
4. This crossover is completed without the need for any correction procedure to make the generated children feasible, since the offspring will always be rearrangements, or shufflings, of the information contained in the parents.
• In the case of the crossover function PMX:
1. This crossover takes into account just the parents' geometric information. Thus, it is necessary to define two cutoff points randomly. Let us assume that these cutoff points define the jobs represented by coordinates 3, 4 and 5 as the internal sequence. Consequently, the external part of these parents is defined by coordinates 1, 2, 6, 7 and 8.
2. Immediately, the information contained in the parents' internal parts is passed to the children, so that the internal part of Parent 1 is passed to the internal part of Child 1 and the internal part of Parent 2 is passed to the internal part of Child 2.
3. To ensure that both parents pass genetic information to both children, in this step Parent 2 transfers its external part to Child 1 and Parent 1 transfers its external part to Child 2. Thus, we have Child 1 = (4, 4, 3, 4, 4, 2, 1, 1) and Child 2 = (1, 2, 3, 3, 2, 3, 2, 1).
4. As this crossover function takes only the geometric information from the parents, it is possible that some offspring is not a feasible solution for the JSSP. And, in fact, this is what happens in this example, since Child 1 represents the index job 4 four times, whereas it should represent it only twice. The same occurs for Child 2, which represents index jobs 2 and 3 three times each. Therefore, it may be necessary to use some technique for projecting solutions onto the feasible space. This projection is usually done according to the Hamming distance, and this is how we proceed in this work. Therefore, the corrected children generated by this crossover are: Child 1 = (2, 3, 3, 4, 4, 2, 1, 1) and Child 2 = (1, 2, 3, 3, 2, 4, 4, 1).
A schematic of the example described is shown in Figure 2.
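As an illustration, OX2 can be implemented in a few lines (our own sketch, not the authors' code; PMX is omitted here because it additionally requires the Hamming-distance projection step). This sketch reproduces the worked example above:

```python
def ox2(parent1, parent2, fixed_jobs):
    """Order-based crossover (OX2) for operation-order chromosomes.

    fixed_jobs: job indices whose positions are kept from one parent; the
    remaining positions are filled with the other parent's non-fixed genes,
    preserving their relative order. Feasible by construction.
    """
    def make_child(keeper, donor):
        # keep the fixed genes where they sit in `keeper` ...
        child = [g if g in fixed_jobs else None for g in keeper]
        # ... and fill the holes with `donor`'s non-fixed genes, in order
        rest = (g for g in donor if g not in fixed_jobs)
        return [g if g is not None else next(rest) for g in child]
    return make_child(parent1, parent2), make_child(parent2, parent1)

p1 = (1, 2, 3, 4, 4, 3, 2, 1)
p2 = (4, 4, 3, 3, 2, 2, 1, 1)
c1, c2 = ox2(p1, p2, {2, 3})
print(c1)  # [4, 2, 3, 4, 1, 3, 2, 1]
print(c2)  # [1, 4, 3, 3, 2, 2, 4, 1]
```

Because each child is a permutation of its keeper parent's multiset of genes, no repair step is ever needed.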

Multi-Crossover Operator
As the crossover operator, a more embracing and rigorous version of the crossover operator of Reference [22] is proposed. We suppose that the use of distinct crossover techniques increases the power of local exploitation, since they define different strategies to combine the same individuals, allowing the investigation of different search areas. Thus, the crossover operator works on three randomly selected individuals of the population; occasionally different crossover techniques defined by F_× are applied to all possible pairs of these three chromosomes until three offspring are found that surpass their respective parents or until each pair has performed R_c crossovers. A detailed operator schematic is presented in Algorithm 1.

Algorithm 1 Proposed multi-crossover operator.
Input: p_1, p_2, p_3, three randomly selected individuals taken from P_selected; F, the fitness function
1: for k := 1 to 3 do
2:   if k == 1 then
3:     In the first iteration, select the pair formed by p_1 and p_2
4:   else if k == 2 then
5:     In the second iteration, select the pair formed by p_1 and p_3
6:   else if k == 3 then
7:     In the third iteration, select the pair formed by p_2 and p_3
8:   end if
9:   Evaluate the fitness of the couple selected for the k-th iteration
10:  for i := 1 to R_c do  Try to get a child better than the parents a maximum of R_c times
11:    f_× := pick_randomly(F_×)  pick_randomly(Y) is a function that returns randomly some element from the set Y
12:    Generate a pair of children (ĉ_{i,1}, ĉ_{i,2}) using the selected f_×
13:    F_{i,1} := F(ĉ_{i,1})  Evaluate the first child
14:    F_{i,2} := F(ĉ_{i,2})  Evaluate the second child
15:    if F_{i,1} ≤ F_{i,2} then  If the first child is the best, it should be considered for comparison
16:      ĉ_i := ĉ_{i,1}; F_i := F_{i,1}
17:    else  If the second child is the best, it should be considered for comparison
18:      ĉ_i := ĉ_{i,2}; F_i := F_{i,2}
19:    end if
20:    if F_i < F_{p_1} or F_i < F_{p_2} then  If the best generated child is better than one of the parents
21:      break  Stop generating offspring and get out of this loop
22:    end if
23:  end for
24:  i* := arg min_i {F_i}  Take the generated child with the best fitness value
25:  c_k := ĉ_{i*}
26: end for
Output: the offspring c_1, c_2 and c_3

Thus, in all possible pairs of three individuals randomly taken from a set of pre-selected individuals P_selected, distinct crossover functions are eventually performed until a solution is generated that has a fitness value better than a parent's fitness value or until the algorithm performs R_c crossovers.
The search criterion of this operator is far stricter than that of the operators of Reference [22], since those operators perform crossover until a solution is found whose fitness value is better than the worst fitness value in the entire population, and the fitness values of the parents are not necessarily important for the procedure. Therefore, the proposed operator should be able to find good solutions more easily than the crossover operator of Reference [22], since it performs a more careful and strict search. Furthermore, the use of different crossover techniques increases the search over the feasible space, since the solutions are generated by distinct crossover methodologies and, therefore, the search area can be explored by distinct strategies.
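The pair-wise search inside the multi-crossover operator can be sketched as follows (our own illustration; F_x stands for any collection of crossover functions returning two children, and R_c is the crossover budget from the text):

```python
import random

def multi_crossover_pair(p1, p2, F_x, fitness, R_c, rng=random):
    """One pair of the multi-crossover search (sketch).

    Repeatedly applies a randomly chosen crossover function from F_x,
    at most R_c times, stopping early as soon as the best child generated
    beats one of its parents; returns that best child.
    """
    fp1, fp2 = fitness(p1), fitness(p2)
    best, best_fit = None, float("inf")
    for _ in range(R_c):
        f_x = rng.choice(list(F_x))
        for child in f_x(p1, p2):
            fc = fitness(child)
            if fc < best_fit:
                best, best_fit = child, fc
        if best_fit < fp1 or best_fit < fp2:
            break  # stricter criterion: beat a parent, not merely the population's worst
    return best
```

Run on each of the three pairs (p1, p2), (p1, p3), (p2, p3), this yields the three offspring of Algorithm 1.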

Proposed Mutation Operator
To detail our mutation operator, we need to discuss the use of different mutation functions. In this way, in Section 4.5.1, we model the possible functions for this operator, and we exemplify the application of three functions widely used in JSSP. In Section 4.5.2, we present the strategy of a local search associated with the mutation operator that makes up our method and consists of one of the contributions of our work.

Mutation Functions
Similar to the crossover operator, the proposed mutation operator works according to a set of n_mut mutation functions in a framework. That is, in the mutation process performed by the proposed method, a chromosome may be mutated by one of the functions of the set F_mut, defined in Equation (5):

F_mut = {f_mut,1, f_mut,2, ..., f_mut,n_mut},  (5)

where each function f_mut ∈ F_mut is a mutation function that operates with respect to two coordinates i and j of a chromosome, making these functions of the form presented in Equation (6):

f_mut(c, (i, j)) = ĉ,  (6)

where c and ĉ are chromosomes and (i, j) is a pair of chromosome coordinates.
In our work, we focus on the three most commonly used mutation functions in machine scheduling problems: Swap, Inverse and Insert [69], respectively represented by the functions f_swap, f_inverse and f_insert. Thus, all tests performed in our evaluations are made according to F_mut = {f_swap, f_inverse, f_insert}. We illustrate the operation of these mutation functions with an example. Suppose we perform perturbations using these three functions on the chromosome c = (4, 3, 2, 3, 2, 4, 1, 1), considering the same pair of coordinates (i, j) = (3, 8). The mutation functions perturb c as follows:
• f_swap((4, 3, 2, 3, 2, 4, 1, 1), (3, 8)): The job represented by coordinate i = 3, in this case index job 2, changes places with the job represented by coordinate j = 8, which is index job 1. Thus, the mutant form of c by the f_swap function is (4, 3, 1, 3, 2, 4, 1, 2).
• f_inverse((4, 3, 2, 3, 2, 4, 1, 1), (3, 8)): All coordinates between i = 3 and j = 8 are inverted, acting as an extension of the f_swap function that also reverses the internal values between coordinates 3 and 8. Therefore, the mutant form of c by the f_inverse function is (4, 3, 1, 1, 4, 2, 3, 2).
• f_insert((4, 3, 2, 3, 2, 4, 1, 1), (3, 8)): The job represented by coordinate j = 8 is inserted at the coordinate subsequent to i = 3, that is, at the fourth coordinate. The jobs represented by the coordinates between i = 3 and j = 8 are then shifted one position ahead. Thus, the mutant chromosome generated by f_insert is (4, 3, 2, 1, 3, 2, 4, 1).
In Figure 3, the comparative operation of the considered mutation functions for the discussed example is presented.
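The three mutation functions, using the 1-indexed coordinate convention of the example above (and assuming i < j), can be sketched as:

```python
def f_swap(c, i, j):
    """Swap the jobs at coordinates i and j (1-indexed, i < j assumed)."""
    c = list(c)
    c[i - 1], c[j - 1] = c[j - 1], c[i - 1]
    return c

def f_inverse(c, i, j):
    """Reverse the whole segment between coordinates i and j."""
    c = list(c)
    c[i - 1:j] = reversed(c[i - 1:j])
    return c

def f_insert(c, i, j):
    """Move the job at coordinate j to the position right after coordinate i."""
    c = list(c)
    g = c.pop(j - 1)
    c.insert(i, g)
    return c

c = (4, 3, 2, 3, 2, 4, 1, 1)
print(f_swap(c, 3, 8))     # [4, 3, 1, 3, 2, 4, 1, 2]
print(f_inverse(c, 3, 8))  # [4, 3, 1, 1, 4, 2, 3, 2]
print(f_insert(c, 3, 8))   # [4, 3, 2, 1, 3, 2, 4, 1]
```

All three return permutations of the input, so feasibility of the chromosome is preserved by construction.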

Local Search Mutation Operator
As the mutation operator, we propose a generalization of the local search operator of Reference [6] as a variation of the mutation operator of Reference [22]. Thus, for each individual generated by the crossover operator, one mutation function of F_mut is chosen randomly and applied successively R_m times to randomly chosen coordinates of the chromosome, keeping the beneficial modifications and proceeding with the mutation method from them, giving rise to a mutant population with the same number of individuals as the child population. However, in order to maintain the traditional characteristic of the mutation operator, which is to disturb chromosomes regardless of whether the process improves or worsens them, the option of applying just one execution of a mutation function, corresponding to a simple mutation, was added for a percentage of individuals. Therefore, only a percentage LS of the population of children is mutated in the local search form, and a percentage 1 − LS receives a simple mutation. The scheme is presented in Algorithm 2.
The main purpose of this procedure is to perform thorough searches in regions close to known good solutions, as such solutions have been determined to be better than their respective parents and likely to enhance the solutions of previous generations.
The local search and mutation operator of Reference [6] is a special case of our local search mutation operator if we define R m = N · M and F mut = { f swap , f inverse , f insert }. Thus, our methodology consists of a proposal to generalize the tool of Reference [6] with respect to the mutation. This modification, however simple, may be able to save processing on low complexity JSSP instances by setting a small value for R m and a set of mutation functions with few elements. Moreover, this generalization makes the methodology more versatile, since for high complexity instances, we can define R m as a high value and F mut as a set with more elements in order to improve the search capability of the proposed technique.
Algorithm 2 Proposed local search mutation operator.
Input: P_child, the offspring generated by the multi-crossover operator; F, the fitness function
1: f_mut := pick_randomly({f_mut,1, ..., f_mut,n_mut})  Take a mutation function randomly
2: P_mut := {}
3: for c ∈ P_child do  All the offspring generated in the multi-crossover operator will be mutated
4:   for r := 1 to R_m do  Applied with probability LS; otherwise a single simple mutation is performed
5:     Pick randomly a pair of coordinates (i, j)
6:     ĉ := f_mut(c, (i, j))  Apply the mutation function
7:     F_ĉ := F(ĉ)  Evaluate the mutant child
8:     if F_ĉ ≤ F_c then  If the mutation was beneficial, then
9:       c := ĉ  Update the child c and continue from it on the next perturbation
10:    end if
11:  end for
12:  P_mut := P_mut ∪ {c}
13: end for
Output: the mutant population P_mut
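The local search branch of this operator can be sketched as follows (our own illustration; a mutation function here takes a chromosome and a 1-indexed coordinate pair, as in the examples of the previous section):

```python
import random

def local_search_mutation(child, F_mut, fitness, R_m, rng=random):
    """Local search mutation (sketch): one randomly chosen mutation function
    is applied R_m times at random coordinate pairs; beneficial (or neutral)
    moves are kept and the search continues from the improved chromosome."""
    f_mut = rng.choice(list(F_mut))
    best, best_fit = list(child), fitness(child)
    n = len(best)
    for _ in range(R_m):
        i, j = sorted(rng.sample(range(1, n + 1), 2))  # random 1-indexed pair, i < j
        cand = f_mut(best, i, j)
        fc = fitness(cand)
        if fc <= best_fit:      # keep the modification and continue from it
            best, best_fit = cand, fc
    return best
```

The acceptance rule never worsens the child, so the returned chromosome is at least as good as its input.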

Proposed Massive Local Search Operator
In this work, we propose as the massive local search operator an improvement of the elite local search proposed in Reference [6]. This enhancement comes through the use of more than one perturbation function, causing the operator to eventually use distinct mutation functions instead of just one, as in Reference [6]. A massive local search operator has as its primary objective to evaluate which disturbances made with respect to some mutation function improve an individual's fitness. This procedure is performed by considering all possible combinations of the coordinates of a good individual using different perturbation strategies. In Reference [6], this procedure is performed only with the mutation function f_swap on the best individual in the current population. In our work, we propose that this procedure occurs using different perturbation functions taken from a given set F_pert. (We call the functions used in this section "perturbation functions" to avoid confusion with the mutation functions of Section 4.5; however, the perturbation functions used in the massive local search operator are defined in the same way as the mutation functions, according to Equation (6).) In detail, we propose to randomly take a function from F_pert, defined in Equation (7), and then carry out all possible perturbations of the coordinates of an individual using this function, so that perturbations beneficial to the individual are always maintained:

F_pert = {f_pert,1, f_pert,2, ..., f_pert,n_pert},  (7)

where f_pert,i is a function in the form of Equation (6) for all i. We emphasize, therefore, that our method does not necessarily demand that F_pert = F_mut. However, to facilitate the description and execution of the experiments, we assume that F_pert = {f_swap, f_inverse, f_insert}. In this case, the algorithm not only performs successive substitutions of operations in a given solution but occasionally performs successive insertions and inversions, increasing the diversification of the massive local search. We suppose that, over the generations, using a set of mutation functions instead of just one function can improve the operator's ability to explore the search space.
In this work, we propose that the massive search be applied to a group of individuals and not only on the best individual, as is the case with the elite operator. We believe that this generalization can help the method to avoid premature convergence and stagnation of the population around a local optimum. Mathematically, the massive local search must be applied to all individuals of a given set P massive .
To carry out the experiments, we assume that P massive is formed by the two best individuals in the population, with the restriction that they are different. In this way, two good and distinct genetic inheritances are preserved in the population, helping to maintain genetic variability.
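A minimal sketch of how P massive could be selected under this restriction follows (Python used for illustration; the helper name `select_p_massive` is ours, not the paper's):

```python
def select_p_massive(population, fitness):
    """Return the best individual plus the best individual different from it.

    Assumes a minimisation objective: lower fitness (makespan) is better.
    """
    ranked = sorted(range(len(population)), key=lambda k: fitness[k])
    best = population[ranked[0]]
    for k in ranked[1:]:
        if population[k] != best:
            return [best, population[k]]
    return [best]  # degenerate case: every individual is identical
```

The fallback branch covers a fully converged population, in which only one distinct individual exists.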
Thus, the massive local search operator proposed is coded in Algorithm 3.

Algorithm 3
Proposed massive local search operator.

Input: P massive — the set of individuals selected for massive local search
1: for each individual c in P massive do
2:   Take a perturbation function f pert at random from F pert
3:   for i := 1 to N · M do   (all combinations of the chromosome coordinates will be considered)
4:     for j := 1 to N · M do
5:       ĉ := f pert (c, (i, j))   (apply the perturbation considering the coordinates i and j)
6:       Evaluate the fitness of the perturbed individual
7:       if Fĉ ≤ F c then   (the perturbation was beneficial)
8:         c := ĉ   (update the individual and execute the next perturbations from this version)

The use of all operators together follows a methodology similar to that used by Reference [6], which consists of the following central steps:
1. Initiate the configuration of our method, which includes choosing the standard definition of parameters and determining the crossover, mutation, and perturbation functions;
2. Generate the initial population using the shuffle function defined in Equation (2);
3. Select the individuals and perform our multi-crossover operator;
4. Perform our local search operator on all the offspring generated in the previous step;
5. Run the massive local search operator to look for better solutions around a group of good individuals;
6. Generate a new population using the roulette-wheel strategy, retaining the best individual;
7. Return to step 3 (selection).
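The nested loop of the massive local search (Algorithm 3, invoked in step 5 above) can be sketched as follows. This is an illustrative Python rendering under stated assumptions: chromosomes are lists, `fitness_fn` stands for the makespan evaluation (here it is passed in as a parameter), and `f_swap` below is one member of F pert, redefined locally so the sketch is self-contained.

```python
import random

def f_swap(c, ij):
    """One possible perturbation function: exchange genes i and j."""
    i, j = ij
    c = list(c)
    c[i], c[j] = c[j], c[i]
    return c

def massive_local_search(p_massive, fitness_fn, f_pert_set):
    """Exhaustively perturb every coordinate pair of each selected individual,
    keeping each perturbation that does not worsen the fitness (minimisation)."""
    improved = []
    for c in p_massive:
        f_pert = random.choice(f_pert_set)  # one randomly drawn perturbation function
        best_fit = fitness_fn(c)
        n = len(c)                          # n = N * M in the paper's notation
        for i in range(n):                  # all (N*M)^2 coordinate combinations
            for j in range(n):
                cand = f_pert(c, (i, j))
                cand_fit = fitness_fn(cand)
                if cand_fit <= best_fit:    # beneficial (or neutral) perturbation
                    c, best_fit = cand, cand_fit  # continue from the improved version
        improved.append(c)
    return improved
```

Because improvements are accepted greedily and the search continues from the updated individual, the returned fitness can never be worse than the input fitness, and the cost is quadratic in the chromosome length, which is discussed again in the timing analysis of case study I.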
Notice that the method must be used to solve one JSSP instance at a time. Figure 4 shows the flowchart containing all the details about the steps of our proposed mXLSGA, which is the proposed meta-heuristic for application in JSSP instances.
Observing the flowchart of the proposed method, we can see clear differences between mXLSGA and the techniques that inspired it. The use of multi-crossover strategies with several crossover functions is unique to the proposed methodology. Furthermore, only mXLSGA uses more than one perturbation function in the massive local search operator and applies that operator to a set of individuals. All GA-like methods influenced the modeling of mXLSGA, but there are also significant differences, highlighted in the sequence. Like aLSGA, our mutation operator is divided into two possible subroutines: the first with local search behavior, applying successive perturbations; the second a simple mutation, in which only one perturbation is applied to the chromosome. Unlike aLSGA, however, mXLSGA draws on a group of mutation functions in these procedures. Similarly, unlike aLSGA's elite local search operator, mXLSGA's massive local search operator uses a set of perturbation functions, not just the function f swap, and applies them to a set of individuals, not just to the best individual.
[Figure 4. Flowchart of the proposed mXLSGA: define the initial parameters and set it = 0; generate the initial population P 0 using f shuffle (Equation (2)); evaluate P it; select individuals from P it via roulette wheel for crossover and store them in P selected; while at least three individuals remain in P selected, take p 1, p 2, p 3 at random and apply Algorithm 1 to obtain c 1, c 2, c 3; apply Algorithm 3 on P massive to obtain P improved; create P it+1 from P it ∪ P mut ∪ P improved using roulette wheel, retaining the best individual.]

In summary, we present in Table 1 how each of the GA-like techniques uses the strategies mentioned, and we compare this use with our mXLSGA.
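The survivor-selection step (roulette wheel with retention of the best individual) can be sketched as follows. This is a hedged illustration in Python, not the paper's Matlab code: for a minimisation objective, one common choice, assumed here, is to weight each individual by the inverse of its fitness.

```python
import random

def roulette_wheel_next_population(population, fitness, size):
    """Roulette-wheel survivor selection with elitism.

    Lower fitness (makespan) is better, so selection weights use 1/fitness;
    this assumes strictly positive fitness values.
    """
    best = min(range(len(population)), key=lambda k: fitness[k])
    weights = [1.0 / f for f in fitness]
    survivors = [population[best]]  # elitism: the best individual always survives
    survivors += random.choices(population, weights=weights, k=size - 1)
    return survivors
```

Retaining the best individual guarantees that the best makespan found never degrades between generations, which is the behavior described for step 6 of the method.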

Implementation and Experimental Results
We demonstrate the effectiveness of the proposed method through three different case studies: the first dedicated to analyzing the operators of the method in isolation to evaluate the influence of each one of them during the solution of JSSP instances; the second specialized in evaluating the capacity of the method to find optimal solutions to the problem addressed, being compared with different types of meta-heuristics that make up the state of the art; and the third dedicated to comparing our method with other GA-like techniques when solving the JSSP.
The proposed algorithm was coded in Matlab, and the tests were performed on a computer with a 2.4 GHz Intel(R) Core i7 CPU and 16 GB of RAM. We emphasize that we used only the standard functions of the Matlab IDE to conduct the computational experiments and did not use ready-made optimization software packages.

Case Study I: Analysis of Isolated Operators
In this case study, we evaluated different configurations of the method to investigate how each of the operators proposed in this work acts on it. Specifically, we want to observe which operators most strongly influence the proposed method. For this, we evaluated different configurations of our technique on three LA [17] instances. These configurations were stipulated so that the influence of each operator could be analyzed separately; that is, we defined one configuration as a basic GA, another as a basic GA with the multi-crossover operator, another as a basic GA with a version of the proposed local search operator, and so on. The configurations investigated are listed in Table 2.
Each of the configurations presented in this case study was executed 35 times on three LA instances of different sizes: LA 03, with dimension 20 × 5, considered easy; LA 17, of size 10 × 10, medium; and LA 30, with size 20 × 10, difficult. These instances were chosen so that at least one instance from each of the three main groups of dimensions was considered, in order to verify the efficiency and behavior of the algorithm on instances of varying dimensions, from the simplest to the most complex. The fitness values for each configuration were used to make the box plots in Figure 5.
Looking at the box plots shown in Figure 5, we can see that the addition of any of the proposed operators to the GA was very beneficial. However, some operators have greater influence in certain cases. For example, the multi-crossover operator has greater influence on less complex instances, since in Figure 5a it works as well as or even better than the GA+LS* and GA+LS** configurations, while on more complex instances (Figure 5b,c) the GA+mX configuration presents results statistically inferior to those of GA+LS* and GA+LS**.
Concerning the mutation operators that use local search strategies, we noticed that, as the complexity of the analyzed instance increases, the performance of GA+LS* improves relative to the GA+LS** configuration. This is because, the greater the complexity of the instance, the more advantageous local search strategies become compared to the genetic variability guaranteed by the operator with LS = 1. However, we can see in Figure 5a that, on instances of lower complexity, the GA+LS** configuration presents much better and more stable results than GA+LS*. Thus, a good strategy is to use a value between 0.8 and 1 for LS in the next case studies.
We can also observe that the massive local search operator is the one that presents the best quality and greatest stability in the results. Adding a massive local search operator to the basic GA yields better fitness values than adding two distinct operators of other kinds, as is the case with the GA+mX+LS** configuration, which results in worse fitness values than the GA+ELSA and GA+MLS configurations in the box plots of Figure 5a,c. Also, the use of more than one perturbation function in F pert resulted in GA+MLS presenting better and more stable results than GA+ELSA, which uses only the f swap function in F pert, in all evaluated cases. Thus, we can conclude that the operator with the greatest influence is the massive local search operator.
The use of two local search operators in GA + mX + LS** made this configuration more stable than the simplest configurations that use only one operator, in this case the configurations GA + mX, GA + LS* and GA + LS**. In addition, also compared to these configurations, GA + mX + LS** presents better fitness values in the instances LA03 ( Figure 5a) and LA30 (Figure 5c).
Finally, the configuration corresponding to the joint use of all the operators discussed is the mXLSGA method, which presents the best results in all evaluated cases, most notably on the LA03 (Figure 5a) and LA17 (Figure 5b) instances. This confirms that using all operators concurrently is the best configuration for the methodology, since it is in this configuration that the best results are obtained. Table 3 shows the average time required, over 35 independent executions, for each configuration of the proposed method to solve the JSSP instances in this case study. As we can see, the basic GA is the fastest of all the compared techniques. However, the addition of the multi-crossover operator (GA + mX) does not radically compromise the computational time while, as shown in the box plots of Figure 5a-c, considerably improving the method's ability to obtain good results. In the case of the versions that use only massive local search operators (GA + ELSA and GA + MLS), performance clearly improves and, on smaller instances such as LA03 and LA17, the computational time is not drastically increased; the same does not hold for the LA30 instance. This is due to the quadratic behavior of these operators since, as we can see in their formulation in Algorithm 3, in each iteration of the method (N · M)² perturbations are applied to the chromosomes of P massive. Something similar occurs with the configurations that use local search strategies only in the mutation operator (GA + LS* and GA + LS**). Specifically, these are the most time-expensive isolated operators on the smaller instances (LA03 and LA17), and this cost increases proportionally with the complexity and size of the instance.
These facts are reflected in the GA + mX + LS** configuration, whose average time in all instances is approximately the sum of the times of the GA + mX and GA + LS** configurations. In all instances, the configuration that obtains the best performance (mXLSGA) is also the most costly in computational time, but this is an expected result since this technique uses all the operators described in this work. Thus, in the following case studies, we will direct our analysis and comparisons to our mXLSGA.

Case Study II: mXLSGA for JSSP and Comparison with Other Algorithms
In this case study, we will evaluate the capacity of the proposed methodology to find optimal solutions in the search space. For this, we intend to demonstrate the effectiveness of the method when applied in JSSP instances present in the specialized literature.
In detail, to evaluate the proposed approach, experiments were performed on 58 JSSP instances: 3 FT instances [16], 40 LA instances [17], 10 ORB instances [18] and 5 ABZ instances [19]. The results obtained were compared with papers from the specific literature. The articles selected for each comparison are relevant works that address the JSSP on the same instances; when available, papers published in the last three years were adopted. The papers selected for comparison were: SSS [61], GA-CPG-GT [14], DWPA [15], GWO [11], HGWO [12], MA [9], IPB-GA [8] and aLSGA [6].
The configuration of parameters for the mXLSGA was established through tests and also taking into consideration, when possible, a closer parameterization of the works that were used for comparison. In this way, the parameters were defined as shown in Table 4.
The proposed mXLSGA method was executed 10 times for each JSSP instance, and the best value obtained was used for comparison with the other papers. In most of the comparative works, the authors do not mention the processing time of their techniques and present only the best result. We proceed in the same way in this case study, reserving a more detailed analysis of time and other statistical measures for the next case study, in which we programmed versions of the compared techniques and were therefore able to observe such measures. Table 5 shows the results of the LA [17], FT [16], ORB [18] and ABZ [19] instance tests. The columns indicate, respectively, the instance tested, the instance size (number of Jobs × number of Machines), the optimal solution of each instance, the results achieved by each method (best solution found and error percentage, Equation (8)), and the mean error for each benchmark (MErr).
E % = ((Best − BKS)/BKS) × 100, (8)

where E % is the relative error, BKS is the best known solution, and Best is the best value obtained by executing the algorithm on each instance. As shown in Table 5, mXLSGA found the best known solution in 100% of the FT instances, 70% of the LA instances, 30% of the ORB instances, and 40% of the ABZ instances.
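The relative error defined above is straightforward to compute; the following one-line helper (hypothetical name, Python used for illustration) makes the definition concrete:

```python
def relative_error(best, bks):
    """Percentage gap between the best value found and the best known solution (BKS)."""
    return (best - bks) / bks * 100.0
```

For instance, a method that reaches the BKS exactly scores 0.00, while finding a makespan of 1005 against a BKS of 1000 scores 0.50.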
The mXLSGA reached the best known solution in 28 LA instances and obtained a mean relative error (MErr) of 0.61. The SSS and HGWO methods obtained lower average errors than our mXLSGA, namely 0.59 and 0.38, respectively, but they did not test all the LA instances. If we consider only the instances tested by SSS, our mXLSGA would obtain a MErr of 0.46; considering only the instances tested by HGWO, our method would obtain a MErr of 0.00. In the FT instances, the mXLSGA reached the best known solution in all 3 instances, with a MErr of 0.00. In the ORB instances, the mXLSGA reached the best known solution in 3 instances and obtained a MErr of 0.54, the lowest MErr of all compared algorithms. In the ABZ instances, the mXLSGA reached the best known solution in 2 instances and obtained a MErr of 4.46. The method with the lowest error was GA-CPG-GT, but that work did not test all ABZ instances; comparing only the instances on which GA-CPG-GT was tested, mXLSGA would have obtained a mean relative error of 0.00, that is, mXLSGA achieved the best known solution in the first 2 ABZ instances.
In particular, our method surpassed the technique on which it was based, aLSGA. In the LA instances, our method obtained 6 more BKS than aLSGA. The aLSGA obtained a MErr of 0.80 in the LA instances, while mXLSGA obtained 0.61; moreover, aLSGA was tested only on the first 35 LA instances, and if we compute the MErr over those 35 instances only, mXLSGA obtains 0.37, less than half the value obtained by aLSGA. For the ORB instances, mXLSGA obtained a MErr less than one third of that obtained by aLSGA. The improvement achieved by mXLSGA is certainly due to the insertion of the multi-crossover operator and the enhancements employed in the local search techniques.
Analyzing the data presented in Table 5, we can see that, on the tested JSSP instances, the proposed mXLSGA performs better than or on par with the compared state-of-the-art algorithms.

Case Study III: Statistical Analysis of Ga-Like Methods
This case study was designed to verify the effectiveness of the mXLSGA method when compared to the main GA-like methods in the specialized literature: the basic GA, GSA [22], LSGA [4], and aLSGA [6]. We compared mXLSGA with these techniques because they were the basis of inspiration for its modeling and because they still represent the state of the art in GA-based meta-heuristics for the JSSP. Effectiveness is assessed through the analysis of different statistical measures and the running time of each method. Indeed, we implemented all the compared techniques ourselves to perform these analyses. All methods were coded as faithfully as possible to the descriptions in their respective articles; however, we emphasize that our coding may differ from the original for several reasons, such as different programming techniques, different parameterizations (for example, the number of executions), or details of a method that are not present in the text of the original work or that we interpreted differently.
We tried to follow the original parameterization of each technique as faithfully as possible; however, we used 100 individuals and 100 iterations in all the considered methods. The configuration used is presented in Table 6, where c best is the best individual in the population and c best,1 and c best,2 are the two best individuals in the population that are different.

The compared techniques were executed 35 times on instances of different dimensions in which our method can find the optimal value of makespan considering case study II (Section 5.2). In Table 7, we can see some statistical information about these executions, namely: the best fitness value achieved by the technique (Best); the worst fitness value achieved (Worst); the average of the fitness values of the executions of each technique (Average); the standard deviation of these values (SD); the number of times the method has reached the optimal value (Number opt); the number of iterations (Number it) needed to reach the optimum value; and the average time (AT) in seconds that the technique takes to perform 100 iterations.
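The statistics reported in Table 7 can be reproduced from the raw fitness values of the 35 runs; the small helper below (a hypothetical name, using Python's standard `statistics` module for illustration) shows the computation for the value-based columns:

```python
import statistics

def run_statistics(fitness_values, optimum):
    """Summarise independent runs the way Table 7 does: best, worst, average,
    sample standard deviation, and how often the optimum was reached."""
    return {
        "Best": min(fitness_values),
        "Worst": max(fitness_values),
        "Average": statistics.mean(fitness_values),
        "SD": statistics.stdev(fitness_values),  # needs at least two runs
        "Number opt": sum(1 for v in fitness_values if v == optimum),
    }
```

The remaining columns of Table 7 (iterations to reach the optimum and average time per 100 iterations) would come from per-run logs rather than from the final fitness values.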
As shown in Table 7, mXLSGA found the optimal fitness value in all instances analyzed and, in instances FT 06, LA 01, LA 06 and LA 11, the proposed method found the optimal value in all executions. The mXLSGA presented the lowest worst value and the lowest average in all instances. In instances LA 16, LA 23, and LA 26, mXLSGA did not obtain the lowest standard deviation; however, it was the only method that found the optimal value in these instances. Our method required the longest average execution time, but, as explained in the following paragraphs, this is not a deficiency, since it does not need all 100 generations to converge. To visualize the statistical performance of each method, we made box plots of the fitness values achieved in this case study, shown in Figure 6. As highlighted in case study I (Section 5.1), the box plots make it clear how the presence of a massive local search operator makes a method much more robust than the others, since in all instances the aLSGA and mXLSGA methods are the most stable and have the best fitness values. This difference becomes more pronounced as the complexity of the instance increases. The graphs also reflect the values presented in Table 7, since the mXLSGA box lies below the other boxes in all evaluations. Some details also stand out: the mXLSGA standard deviation is the largest on the LA 26 instance because the technique finds the optimal value 6 times, and these runs appear as discrepant points. It is also clear from the graphs that, statistically, our method is the one that most often finds the best makespan configurations. In Figure 7, we highlight the convergence curves of the fitness value of the best individual over the 35 executions of each method in each instance during the 100 generations.
It is clear that, as the complexity of the assessed instance increases, the number of iterations each method needs to achieve its best value also increases. However, in none of the situations did our method require all 100 generations to find the best value. Indeed, mXLSGA finds the optimal makespan, or a value very close to it, within approximately 50 generations in the more difficult instances. In the simpler instances, such as FT 06 (Figure 7a), LA 01 (Figure 7b), LA 06 (Figure 7c) and LA 11 (Figure 7d), our method achieves optimal fitness before the 5th iteration. This does not occur with the other evaluated techniques, which need more iterations to reach the optimal value or, with the configuration used, were not able to reach it at all, as was the case with all techniques except mXLSGA in instances LA 16 (Figure 7e), LA 23 (Figure 7f), LA 26 (Figure 7g) and LA 31 (Figure 7h). In this way, time is not a major concern for our method, since it requires only 50% of the iterations to achieve good results in larger instances, halving the time spent, and only 5% of the iterations to achieve the optimum in simpler instances. Still concerning the computational time analysis, we evaluate the following situation: on each instance considered in this case study, we execute all the techniques taking computational time, instead of the maximum number of iterations, as the stopping criterion. That is, all techniques run for the same amount of time. For each instance, we take as the time limit for all techniques the time the most time-consuming method needed to complete 100 iterations on that instance, according to Table 7.
For example, in this experiment all methods have 39.78 seconds to search for the optimal value of the instance FT 06, since this is the time that the GSA technique, the most time-consuming method on this instance, takes to perform 100 iterations. All techniques were executed independently 35 times following the strategy described, and the results are summarized in Table 8. Our method presented the best results in practically all measures over all instances. However, the complexity of the evaluated instance remains a predominant factor in this experiment, since on a less complex instance such as FT 06 all evaluated methods were able to find the optimal value, 55, in all 35 executions. On more complex instances, such as LA 16, LA 23, LA 26, and LA 31, only the proposed method was able to find the optimal value, even though all techniques ran for the same amount of time. In addition, on these instances our method presented, without ties, the best statistical measures of best fitness, worst fitness, and average fitness. In other words, our methodology presented the best performance on the more complex instances and tied on these measures in the simpler instances. This indicates that the methodology proposed in this work provides better search ability, making the technique more efficient and surpassing the other GA-like algorithms in the literature.
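The wall-clock stopping criterion used in this experiment can be sketched as follows (illustrative Python; `step` stands for one generation of any of the compared methods, and the helper name is ours):

```python
import time

def run_with_time_budget(step, budget_seconds):
    """Iterate a metaheuristic step until the wall-clock budget expires.

    Returns how many full generations fit inside the budget, so all methods
    can be compared under the same amount of computing time.
    """
    start = time.perf_counter()
    iterations = 0
    while time.perf_counter() - start < budget_seconds:
        step()  # one generation: selection, crossover, mutation, local search...
        iterations += 1
    return iterations
```

Faster methods simply complete more generations within the shared budget, which is exactly the trade-off this experiment is designed to expose.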

Conclusions
The objective of this work was to develop an approach to optimize the makespan in the job shop scheduling instances. The proposed technique for achieving the goal was a GA with improved local search techniques and a multi-crossover operator. To evaluate the proposed approach, experiments were conducted in three different case studies.
In the first case study, all operators of mXLSGA were individually evaluated. It was found that all operators jointly contribute to improving the results obtained; that is, no single operator achieved better results than the complete method with all operators combined. However, we observed that the most influential operator is the massive local search operator, which has greater search power than the modified mutation and multi-crossover operators we propose.
By analyzing the results obtained in the second case study, we can observe that the proposed method achieves competitive results on JSSP instances and is able to find good makespan values. The mXLSGA obtained competitive MErr values with respect to the compared algorithms on the LA, ORB and ABZ instances. On the FT instances, mXLSGA obtained a 0.00% error and tied with the aLSGA algorithm. Through this analysis, we can see that mXLSGA is a competitive and versatile method that achieves good results on instances of varying complexity.
In the last case study, we compared our method with the techniques on which its modeling was based, that is, the GA-like techniques that make up the state of the art in solving the JSSP with GA-based meta-heuristics. In this case study, our method obtained the best statistical measures among the compared techniques, although the computational time to obtain these results over 100 iterations was the largest for mXLSGA. In addition, we showed that, according to the convergence curves, our method needs only 5% of the iterations to achieve the optimal solution in small instances, and only half of the iterations to obtain the optimal solution, or a solution very close to it, in larger instances, which does not occur with the other compared techniques. Thus, time is not a concern for mXLSGA, as it works well even with less demanding configurations. We finished this case study with an experiment in which all GA-like methods were executed for the same amount of time on each JSSP instance considered and, observing the results, we concluded that the proposed approach performs better than the other GA-based approaches, especially on the more complex instances.
Analyzing the three case studies presented in this paper, we conclude that mXLSGA is a robust method, with the ability to obtain good results in instances of different complexities and that it has a faster convergence rate when compared to other GA-like methods. The mXLSGA presents better or at least competitive results when compared to other meta-heuristics found in the specialized literature.
In future works, we will study the feasibility of the method in similar combinatorial optimization problems, such as flexible job shop scheduling problem, flow shop scheduling problem, and so forth. Also, we will consider the inclusion of adaptive rules in the discussed operators to control the auto-configuration of parameters. For example, in a future version of our mXLSGA, the method should automatically adjust the crossover and the mutation rates during the iterations according to its performance.