A Hybrid Bat Algorithm for Solving the Three-Stage Distributed Assembly Permutation Flowshop Scheduling Problem

In this paper, a hybrid bat optimization algorithm based on a variable neighbourhood structure and two learning strategies is proposed to solve a three-stage distributed assembly permutation flowshop scheduling problem with the objective of minimizing the makespan. First, the algorithm classifies the population to increase its diversity, resolving the difficult trade-off between convergence and diversity in the bat algorithm. Second, a selection mechanism is used to update each bat's velocity and position, resolving the algorithm's difficulty in balancing exploration and exploitation. Finally, a Gaussian learning strategy and an elite learning strategy help the whole population escape local optima. The simulation results demonstrate that the proposed algorithm solves the DAPFSP well. Moreover, compared with other metaheuristic algorithms, IHBA performs better and fully exploits its advantage in finding optimal solutions.


Introduction
The distributed assembly scheduling problem is a derivative and extension of the assembly scheduling problem. It is mainly used to produce multiple products assembled from different parts. In a decentralized and globalized economy, the current trend toward customization and intelligent manufacturing has led to the rapid development of assembly production. Internationally, companies with multiple production centers or plants are increasingly common. Distributed and intelligent production therefore plays a pivotal and irreplaceable role in the modern manufacturing industry, and this type of scheduling is widely used in supply chains and manufacturing. To maintain a competitive position in a rapidly changing market, managers must quickly make the right decisions about how to allocate work to plants and how to schedule each plant efficiently. The distributed scheduling problem has thus become a hot research topic among scholars and researchers.
The distributed assembly permutation flowshop scheduling problem (DAPFSP) is of great importance to both industry and the research community. As is well known, the DAPFSP consists of two phases: production and assembly. The production phase corresponds to a distributed permutation flowshop scheduling problem, in which n jobs are assigned for processing among f plants, each with m identical machines arranged in a flowshop line. Each job belongs to a certain product, and the jobs of one product cannot be interrupted or interleaved with jobs of other products during production. The assembly phase corresponds to an assembly flowshop scheduling problem, in which s products are made in an assembly plant with a single assembly machine; each product can be assembled only after all of its jobs from the production phase are completed.
Total completion time has been identified as a more relevant and meaningful objective in today's dynamic manufacturing environment, where a batch of work needs to be completed as quickly as possible. This objective minimizes the flow of work and efficiently improves resource utilization. The DPFSP is NP-hard when m ≥ 2, and under this criterion it is a derivation and extension of the traditional permutation flowshop scheduling problem. Similarly, the DAPFSP can be seen as an extension of the DPFSP and is even more complex, so the DAPFSP with completion time as the objective function is also strongly NP-hard. In recent years, the bat algorithm, a simple yet efficient method, has solved many scheduling problems, including the job shop scheduling problem [1], the open shop scheduling problem [2], the task scheduling problem [3], and the production scheduling problem [4], by continuously performing local search in the neighborhood of the current solution to obtain the optimal solution.
In this paper, an improved hybrid bat algorithm, IHBA, is proposed for solving the DAPFSP with the makespan criterion, based on the original bat algorithm. The main contributions of this paper are:
• The algorithm first classifies the population to increase its diversity, resolving the difficult trade-off between convergence and diversity in the bat algorithm.
• A selection mechanism is used to update the velocity and position of the bats, resolving the algorithm's difficulty in balancing exploration and exploitation.
• A Gaussian learning strategy and an elite learning strategy help the whole population escape local optima.
This paper is organized as follows. Section 2 reviews the closely related literature. Section 3 presents the DAPFSP and its mathematical model. Section 4 describes the proposed hybrid bat algorithm in detail. Parameter calibration and simulation results are analyzed in Section 5. The last section summarizes the paper and outlines future work.

Literature Review
It is well known that the distributed permutation flowshop scheduling problem (DPFSP) is one of the most studied production scheduling problems. It has a strong engineering background, with the same job order on all machines, and has been shown to be strongly NP-hard when more than two machines are involved. The distributed assembly permutation flowshop scheduling problem (DAPFSP) has been gaining attention and popularity among scholars and researchers. Its leading exponents are Hatami et al. [5], whose work is very enlightening: in 2013, they were the first to model and study the DAPFSP for complex supply chains. They also introduced a mixed integer linear programming model, proposed three constructive algorithms, tested a variable neighborhood descent (VND) algorithm, and demonstrated the good performance of VND in solving this scheduling problem. They later expanded on this work by adding sequence-dependent setup times in both the production and assembly stages [6]. In 2015, Hatami et al. [7] considered a distributed assembly scheduling problem with sequence-dependent setup times in both phases and the goal of minimizing the makespan, which also lays a theoretical foundation for the sequences in this paper; heuristics and metaheuristics were proposed to solve it. Building on these ideas, many scholars have deepened the research. Ying et al. [8] extended the DAPFSP to flexible assembly and sequence-independent setup times in supply chains; using the makespan as the optimization criterion, constructive heuristics and customized metaheuristic algorithms were proposed to solve this emerging extension. Gonzalez-Neira et al. [9] studied a stochastic version of the DAPFSP and proposed a hybrid algorithm integrating biased randomization and simulation techniques to solve it. Wang [10] introduced a just-in-time constraint between the two phases of the traditional DAPFSP to form a just-in-time DAPFSP minimizing the maximum weighted tardiness cost, and proposed a memetic algorithm based on variable neighborhood search. From the studies reviewed above, it can be concluded that these researchers have extended and improved the problem, and most of them adopt an objective function based on minimizing the makespan, which provides a theoretical basis for the objective function in this paper.
Most scheduling problems are NP-hard, and exact methods simply cannot find the optimal solution in reasonable computational time. Therefore, heuristic and metaheuristic algorithms have been used to find optimal or near-optimal solutions in reasonable computation time. Zhang et al. [11] pointed out that the DAPFSP is a typical NP-hard combinatorial optimization problem and proposed an innovative three-dimensional matrix-cube-based estimation of distribution algorithm to address the difficulties of their DAPFSP-T model. Likewise, Zhang et al. [12] integrated the problem with two machine environments, distributed production and flexible assembly, which can assemble processed jobs into customized products; they proposed a mixed integer linear programming model to characterize the problem and solve small-scale instances, and further proposed an efficient memetic algorithm, which they affirm is an efficient approach to the DAPFSP. Some scholars prefer greedy algorithms. For example, Pan et al. [13] solved a novel DAPFSP by proposing a mixed integer linear model, three constructive heuristics, two variable neighborhood search methods, and an iterated greedy algorithm, using problem-specific knowledge to improve the effectiveness of the algorithms. Similarly, Ochi et al. [14] studied the DAPFSP with the objective of minimizing the makespan and proposed an iterated-greedy-based approach called the bounded search iterated greedy algorithm. To coordinate production and transportation scheduling, Yang [15] proposed a novel DAPFSP-FABD model with the objective of minimizing the total cost of delivery and delay, for which four heuristic algorithms, a variable neighborhood descent algorithm, and two iterated greedy algorithms were proposed. Liu et al. [16] proposed a memetic algorithm for the DAPFSP based on variable neighborhood search, with the makespan as the objective function; an initialization based on the NEH heuristic is applied to the product ordering, and neighborhood structures are introduced into the VNS to perturb the job assignment among factories and adjust the job order within each factory. Huang et al. [17] considered the DAPFSP with a total flow time criterion and proposed an improved iterated greedy algorithm based on groupthink to solve it. It can be seen that exact algorithms can no longer solve the problem at scale, which has promoted the development of heuristic and metaheuristic algorithms capable of finding optimal or near-optimal solutions to this scheduling problem.
More and more scholars are focusing on improving the algorithms themselves. For the time-series-dependent no-wait flow shop scheduling problem, Hu et al. [18] proposed an enhanced differential evolution algorithm. Subsequently, Seidgar [19] took minimizing the weighted sum of expected completion time and average completion time as the objective and proposed four metaheuristic algorithms, namely a genetic algorithm, an imperialist competition algorithm, cloud-theory-based simulated annealing, and an adaptive differential evolution algorithm. Moreover, Li et al. [20] argued that the imperialist competition algorithm can solve the fuzzy distributed assembly flow shop scheduling problem. Furthermore, Al-Behadili et al. [21] proposed a multi-objective optimization model and a particle swarm optimization solution method for the robust dynamic scheduling of permutation flow shops under uncertainty. Likewise, Zhang [22] emphasized that the DAPFSP is a new generalization of the distributed permutation flow shop scheduling problem and the assembly flow shop scheduling problem, proposed an enhanced population-based metaheuristic genetic algorithm, and designed an effective crossover strategy based on local search to accelerate convergence. Li et al. [23] studied the minimization problem of the DAPFSP and proposed a genetic algorithm with an enhanced crossover strategy and three different local searches. It is no coincidence that Mao et al. [24] advocated an improved discrete artificial bee colony algorithm to solve the DAPFSP with a preventive maintenance operator, optimized under the completion time criterion; a local search method with insertion and exchange operators is used to generate neighboring solutions in the employed bee phase and the onlooker bee phase. Song and Lin [25] jointly proposed a genetic programming hyperheuristic algorithm to solve the DAPFSP with sequence-dependent setup times, minimizing the completion time.
In terms of algorithmic improvement, these scholars have further studied genetic algorithms and particle swarm optimization, which also provides a reference for our subsequent simulation experiments.

Problem Description
The traditional distributed assembly permutation flowshop scheduling problem (DAPFSP) can be divided into two phases, namely the production phase and the assembly phase. In this section, the two-stage DAPFSP is extended to a three-stage DAPFSP consisting of a production stage, a transportation stage, and an assembly stage; the problem is described as follows.
Each job j ∈ {1, 2, . . . , n} consists of operations O_ij, O_Tj, and O_Aj. In the production stage, there are n jobs and f identical plants; job j ∈ {1, 2, . . . , n} must be assigned to some plant l ∈ {1, 2, . . . , f} for processing, and each plant is equipped with the same m machines, M = {M_1, M_2, . . . , M_m}, arranged as identical flowshop lines for part processing. Operation O_ij ∈ {O_1j, O_2j, . . . , O_mj} is processed on machine M_i, i = 1, . . . , m, and requires p_ij time units. Each machine M_i can process at most one job at a time. The transport operation O_Tj is executed on machine M_T and takes p_Tj time units. In each permutation flowshop, all jobs are processed along the same route in machine order: first on machine M_1, then on M_2, and so on, finishing on machine M_m. Parts belonging to the same product must be processed consecutively, and no parts of other products may be interleaved. All jobs start at time t = 0 on machine i = 1.
The subsequent phase is the transportation stage: after all operations of a product are completed in the production stage, the transport machine collects all parts of the product and moves them to the assembly stage.
In the assembly stage, there are s products and one assembly machine M_A. Each product r ∈ {1, 2, . . . , s} consists of a defined set of part operations. The assembly operation O_Aj is executed on machine M_A and takes p_Aj time units. The assembly of a product can only start when all jobs contained in the product are completed and the assembly machine is idle. Jobs of the same product are processed consecutively, and the assembly of the next product can only begin after all jobs belonging to the current product have been processed.
From this, it is clear that the objective of the problem is to determine the allocation of products to plants and the sequence of parts and products in each plant so as to minimize the maximum completion time over all products. Based on the above description, the makespan minimization problem is denoted F_m|nwt|C_max. To satisfy the above constraints, a feasible schedule π = {π_1, π_2, . . . , π_n} is found for the n jobs such that the makespan C_max(π) is minimized, where C_max(π) is the time at which the last job is completed on the last machine. The objective is given by Equation (1):

Minimize C_max(π)    (1)
In the transportation phase, the completion-time distance between two adjacent jobs is determined by the processing times of the two operations alone, independent of the other jobs in the arrangement. Therefore, a completion-time distance is defined between each pair of jobs; the distance between two adjacent jobs π_{k−1} and π_k is computed from their processing times. To build a mathematical model based on the above description, the constraints, parameters, and variables are as follows.
Subject to the constraint sets (3)–(20), of which the following are representative:

C_r ≥ C_t + p_r + (Y_{r,t} − 1)g,  r ∈ {1, . . . , s}, t ∈ {0, . . . , s}, r ≠ t    (15)

X_{j,k} ∈ {0, 1},  j ∈ {0, . . . , n}, k ∈ {0, . . . , n}, j ≠ k

Y_{r,t} ∈ {0, 1},  r ∈ {0, . . . , s}, t ∈ {0, . . . , s}, r ≠ t

where X_{jk} is a binary variable equal to 1 if job j is the predecessor of job k, and 0 otherwise; similarly, Y_{rt} equals 1 if product r is the predecessor of product t, and 0 otherwise. C_{ij} is the completion time of job j on machine i, and C_r is the completion time of product r. p_r is the assembly time of product r, n_t is the number of jobs contained in product t, and z_r = ∑_{t=0}^{r} n_t, r ∈ {1, 2, . . . , s}, is the total number of jobs contained in the first r products of the product sequence. In addition, g is a sufficiently large positive number. Note that C_{0,j} = C_{i,0} = C_0 = 0.

Equation (1) states that the goal of the mathematical model is to minimize the makespan. Constraint sets (3) and (4) indicate that each job must have exactly one preceding job and one following job. Constraint sets (5) and (6) force the predecessors and successors of the virtual job 0 to be used f times. Constraint set (7) ensures that a job cannot be both the predecessor and the successor of another job. Constraint set (8) ensures that jobs belonging to the same product are processed consecutively. Constraint set (9) indicates that job j can be processed on machine i only after it has been completed on machine i − 1. Constraint set (10) indicates that job j can only be processed on machine i after job k has finished there if job j is the successor of job k, where g is a sufficiently large positive number. Constraint sets (11) and (12) ensure that each product has exactly one predecessor and one successor. Constraint set (13) ensures that a product cannot be both the predecessor and the successor of another product. Constraint set (14) ensures that a product cannot enter the assembly stage before the production of all the jobs it contains is complete. Constraint set (15) states that if product t precedes product r, then product r cannot start its assembly until product t is completed. Constraint set (16) defines the makespan. Constraint sets (17) to (20) define the domains of the decision variables.
The model uses a sequence-based variable and a set of (min{ f , s} + 1) virtual parts. The sequence starts with a virtual part and ends with a virtual part. These virtual parts divide all parts into min{ f , s} subsequences, each corresponding to a plant. Parts belonging to the same product are never separated. Thus, the product sequence is implicitly included in the part sequence. For example, s = 4, n = 8, m = 2, f = 2. Table 1 shows the machining times of parts and products. One of the feasible solutions is
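The virtual-part encoding described above can be sketched as follows. Since the paper does not specify a concrete data layout, virtual parts are represented here by `None`, an assumption for illustration:

```python
# Split a virtual-part-delimited sequence into per-plant subsequences.
# A feasible sequence starts and ends with a virtual part (None); the
# min{f, s} + 1 virtual parts delimit min{f, s} plant subsequences.

def split_by_virtual_parts(seq):
    plants, current = [], []
    for part in seq:
        if part is None:            # virtual part: close the current plant
            if current:
                plants.append(current)
                current = []
        else:
            current.append(part)    # real part: keep it in the open subsequence
    if current:
        plants.append(current)
    return plants

# f = 2 plants: parts 1, 2, 3 go to plant 1; parts 4, 5 to plant 2.
print(split_by_virtual_parts([None, 1, 2, 3, None, 4, 5, None]))
# prints [[1, 2, 3], [4, 5]]
```

Because parts of the same product are never separated by a virtual part, the product sequence is recoverable from the part sequence, as the text notes.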

Materials and Methods
To improve the local search capability of the algorithm, this section introduces the mathematical representation of feasible solutions, the classification of populations, the selection mechanism, the neighborhood structures, and the Gaussian and elite learning strategies, and proposes a hybrid bat algorithm adapted to the three-stage DAPFSP.

Mathematical Expression of the Feasible Solution of the Three-Stage DAPFSP
The representation of feasible solutions is a crucial step in every metaheuristic. Because the DAPFSP solved in this paper adds a transportation stage, the original job-permutation representation of the PFSP is no longer adequate for this scheduling problem. Therefore, in this section, based on the three-stage DAPFSP model and the properties of the bat algorithm, a feasible solution is divided into a product sequence and s job sequences, each of which corresponds to a product. The job sequence of a part can be viewed as the sequence of product assembly on the assembly flowshop.
In the production and transportation stages, parts are allocated one by one according to the product order. When allocating parts, their jobs are arranged in the corresponding factories according to the job order of the product, and a minimum manufacturing cycle is obtained. Only after all the jobs included in the current part are completed can the next part be produced. When a product has completed all its jobs, it can proceed to the assembly stage. As an example, suppose a set of f sequences is used to represent a feasible solution, with exactly one sequence per plant. Each sequence is an ordering of all the parts assigned to that plant, indicating the order in which the parts enter the flow shop; thus, the order in which products are processed in the assembly stage is also implied. The feasible solution can be expressed as π = {π_1, π_2, . . . , π_f}, where π_k = {π_{k,1}, π_{k,2}, . . . , π_{k,η_k}}, k = 1, 2, . . . , f, is the sequence associated with plant k, and η_k is the total number of parts assigned to plant k. For the example in Section 3, the representation corresponding to this feasible solution is π = {π_1, π_2}. In the feasible solution of the original DAPFSP, the first layer represents the arrangement of all n jobs, and the second layer represents the plants to which these n jobs are respectively assigned. In this section, the concept of product priority is introduced to determine the order of product assembly, and a novel feasible-solution representation for the three-stage DAPFSP is proposed. Specifically, the first layer of the feasible solution expresses the processing priority of the products, and the number of parts assembled into each product equals the number of jobs constituting that product.
The specific steps are as follows. First, the total processing time of each product at each stage is calculated, and the product sequence is randomly initialized. Second, the first two products in the initial product sequence are evaluated, and whichever ordering is better is chosen as the current sequence; each subsequent product is then placed among the products that are already properly scheduled. Finally, an optimal schedule is obtained by selecting the current best sequence for the next iteration. The second layer represents the processing priority of all jobs, where jobs belonging to the same product are arranged in a random order; elements with smaller values have higher priority. The third layer is the plant assignment vector, in which each position records the plant assigned to the corresponding job. The jobs of a part are assigned to the factories one by one in order: a job is selected and tentatively placed after the last job in each factory, the placement with the minimum completion time is identified, and the best assignment is kept for the next iteration. Table 2 describes how the three-stage DAPFSP representation is decoded into a schedule. Suppose there are 3 products (s = 3) consisting of 9 jobs (n = 9) processed in 3 plants. According to the last two layers, called the job sequence and the factory assignment, jobs 5, 4, and 8 are assigned to the first factory and are processed in the order 5, 4, 8 according to the priorities specified in the second layer. The partial schedule of this factory can thus be decoded as π_1 = {5, 4, 8}. Similarly, the partial schedules of the remaining two plants can be decoded as π_2 = {2, 7, 9} and π_3 = {6, 3, 1}, respectively.
According to the first-layer values specifying the product assembly priority, the value 1 in the table corresponds to jobs 5 and 7, which belong to product 3; it follows that, after the processing stage, product 3 should be assembled first. Similarly, product 1 is assembled immediately after product 2. In the traditional representation of feasible solutions, the order of assembly of all products follows a first-come, first-served rule: products are ranked in the processing stage in ascending order of the total completion time of their component jobs, so the priority of products in the assembly phase is determined by their order of arrival at assembly and cannot be specified explicitly. This observation is confirmed by Tozkapan's study of the two-stage assembly scheduling problem [26]. Under a total tardiness objective, efficiency may be higher if the sequence of jobs in the processing phase differs from the sequence in the assembly phase. With a very late due date, for example, the assembly of a product may be delayed even if all of its component jobs have completed the processing phase.
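The decoding of the last two layers can be sketched as follows; the job sequence and factory assignment follow the example in the text, while the data layout itself is an assumption:

```python
# Decode the job-sequence layer and the factory-assignment layer into
# per-factory partial schedules. Jobs are listed in priority order.

def decode(job_sequence, factory_of):
    schedules = {}
    for job in job_sequence:                    # already sorted by layer-2 priority
        schedules.setdefault(factory_of[job], []).append(job)
    return schedules

job_sequence = [5, 2, 6, 4, 7, 3, 8, 9, 1]      # layer 2: priority order (assumed)
factory_of = {5: 1, 4: 1, 8: 1,                 # layer 3: plant of each job,
              2: 2, 7: 2, 9: 2,                 # matching the text's example
              6: 3, 3: 3, 1: 3}

print(decode(job_sequence, factory_of))
# prints {1: [5, 4, 8], 2: [2, 7, 9], 3: [6, 3, 1]}
```

This reproduces the partial schedules π_1 = {5, 4, 8}, π_2 = {2, 7, 9}, and π_3 = {6, 3, 1} from the example.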

Population Classification
The optimization process is divided into two stages. In the first stage, when an individual bat is in a better search position and close to the optimal solution, its loudness and pulse emission rate reach their best state; this process is called the search phase, and its population the search-type population. The second stage covers bats in disadvantaged search positions, i.e., the captive population, so two populations with different functions are obtained. After each iteration, the loudness and pulse emission rate of the bat algorithm are improved by updating the search direction and step length to find the optimal solution. By adjusting the weights and biases of the network, the difference between the mean and the standard deviation of the bat algorithm can be minimized.

Search-Type Population
The combination of the back propagation (BP) algorithm based on the mean square error (MSE) and the gradient descent method minimizes the mean square error [27]. The MSE measure is given in (21):

e = (1/N) ∑_{n=1}^{N} (d(n) − y(n))²    (21)

where d(n) represents the n-th element of the desired signal and y(n) represents the n-th actual output. In the training process, the weight vector at iteration (t + 1) is updated by (22) and (23):

g_t = ∂e/∂W_t    (22)

W_{t+1} = W_t − α g_t    (23)

where α is the learning rate or step length, W_t is the weight vector of the previous iteration, and g_t is the gradient vector.
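A single MSE gradient-descent step can be illustrated on a one-parameter linear model y(n) = w·x(n); the data and learning rate below are purely illustrative:

```python
# One gradient-descent step on the MSE e = (1/N) * sum((d - w*x)^2).
# de/dw = -(2/N) * sum((d - w*x) * x), so w_{t+1} = w_t - alpha * g_t.

def mse_gradient_step(w, xs, ds, alpha):
    n = len(xs)
    g = -(2.0 / n) * sum((d - w * x) * x for x, d in zip(xs, ds))  # gradient g_t
    return w - alpha * g                                           # weight update

# Fit y = w*x to data generated with w = 2; w converges toward 2.0.
w = 0.0
for _ in range(200):
    w = mse_gradient_step(w, [1.0, 2.0, 3.0], [2.0, 4.0, 6.0], 0.05)
```

Each step shrinks the error by a constant factor here, so a few hundred iterations suffice for this toy problem.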
In (22), e is the MSE at step t of the training process, and ∂e/∂w is the derivative of the MSE with respect to each element of the vector w. The weight update is expressed as w_{t+1} = w_t − α_t g_t, where α_t is adjusted to an appropriate value to achieve better convergence. Now, suppose the search-type population has probability pop_s; its loudness and pulse emission rate are updated to:

Captive Population
As a branch of the gradient descent method, the conjugate gradient (CG) method is applied to nonlinear unconstrained optimization problems and has strong local and global convergence [26–28]. The CG method generates a weight sequence by (26):

W_{t+1} = W_t + α_t d_t    (26)

where α_t is the step size obtained by a line search, which can be exact or inexact. In (26), d_t is the descent direction, used here as the search direction, whose formula is shown in (27):

d_{t+1} = −g_{t+1} + β_t d_t,  d_0 = −g_0    (27)

In (27), β_t is the conjugate parameter, g_{t+1} is the gradient of the objective function with respect to the weights at step t + 1, and d_t represents the direction of the previous step.
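The weight-sequence and direction updates above can be sketched on a small quadratic objective; the Fletcher–Reeves choice of the conjugate parameter β_t and the exact line search are standard options, assumed here since the text does not fix them:

```python
# CG on f(w) = 0.5*w'Aw - b'w: w_{t+1} = w_t + alpha_t * d_t  (26)
# with d_{t+1} = -g_{t+1} + beta_t * d_t and d_0 = -g_0        (27).

def cg_quadratic(A, b, w, steps):
    dot = lambda u, v: sum(x * y for x, y in zip(u, v))
    mv = lambda M, v: [dot(row, v) for row in M]
    g = [gi - bi for gi, bi in zip(mv(A, w), b)]        # gradient g_0 = A*w - b
    d = [-gi for gi in g]                               # d_0 = -g_0
    for _ in range(steps):
        Ad = mv(A, d)
        alpha = -dot(g, d) / dot(d, Ad)                 # exact line search
        w = [wi + alpha * di for wi, di in zip(w, d)]   # update (26)
        g_new = [gi + alpha * adi for gi, adi in zip(g, Ad)]
        beta = dot(g_new, g_new) / dot(g, g)            # Fletcher-Reeves beta_t
        d = [-gi + beta * di for gi, di in zip(g_new, d)]  # update (27)
        g = g_new
    return w

# Minimum of 0.5*w'Aw - b'w with A = diag(3, 1), b = (3, 1) is w = (1, 1);
# CG with exact line search reaches it in two steps.
print(cg_quadratic([[3.0, 0.0], [0.0, 1.0]], [3.0, 1.0], [0.0, 0.0], 2))
```

On an n-dimensional quadratic, CG with exact line search terminates in at most n steps, which is the sense in which its convergence is "strong".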
During the iterations, the loudness A_i and the pulse emission rate r_i also change: as a bat moves closer to its prey, its loudness decreases and its pulse emission rate increases. The probability of the captive population is pop_h, with pop_s + pop_h = 1. In summary, the loudness and pulse emission rate of the bat algorithm are updated to:
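The paper's modified update equations are not reproduced in the text; in Yang's original bat algorithm, the qualitative behavior described above (loudness decreasing, pulse rate increasing) is realized by the standard rules A_i^{t+1} = α·A_i^t and r_i^{t+1} = r_i^0·(1 − exp(−γ·t)), with assumed constants α, γ ∈ (0, 1):

```python
import math

# Standard bat-algorithm loudness/pulse-rate updates (Yang's original rules).
# alpha and gamma are illustrative constants, not the paper's calibrated values.

def update_loudness_pulse(A, r0, t, alpha=0.9, gamma=0.9):
    A_next = alpha * A                            # loudness decays toward the prey
    r_next = r0 * (1.0 - math.exp(-gamma * t))    # pulse rate rises toward r0
    return A_next, r_next
```

At t = 0 the pulse rate starts at 0 and approaches its initial maximum r0 as t grows, while the loudness decays geometrically.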

Selection Mechanism
Since the bat algorithm is a learning algorithm, each individual in the population needs to learn from the personal best and global best individuals in order to find the optimal solution. In multi-objective optimization there is no single optimal solution, so the selection mechanism must be redesigned. Which strategy is used to update the personal best p_best and the global best g_best, thereby connecting the decision variable space and the objective space, is particularly important for balancing the exploration and exploitation capabilities of the evolutionary process. In this paper, we fully exploit individuals with better diversity and convergence to help the population escape the local Pareto frontier. Three guides are selected from the search-type and captive populations to guide the flight of every individual bat; how the guides are selected from the two populations is specified below.
Unlike the traditional bat algorithm, this algorithm first generates a set θ of ω solutions and performs a local search on the best solution in θ. Subsequently, IHBA enters the iterative phase. In each iteration, a solution ϕ is selected from θ using the selection mechanism. The solution ϕ is passed to the search phase, yielding two partial sequences: ϕ_Search, with some products and jobs removed, and the remaining part ϕ_Captive. A local search on the products is applied to ϕ_Captive, generating the suboptimal solution ϕ′_Captive. IHBA then reinserts the removed products and jobs into ϕ′_Captive; this process is called the capture phase, and a complete solution ϕ′ is regenerated. A local search is performed on ϕ′ to generate ϕ″. Finally, acceptance criteria are used to decide whether the new solution ϕ″ updates the set.
Therefore, in scheduling problems, the selection mechanism selects a solution from the set θ at the beginning of each iteration; the selected solution is denoted ϕ and proceeds to the subsequent search and capture stages. In this scheduling problem, the minimum completion time and the number of iterations are often used as selection criteria. The proposed selection mechanism therefore consists of two random options, each chosen with 50% probability: either two solutions are randomly drawn from θ and the one with the lower minimum completion time is chosen as ϕ, or two solutions are randomly drawn from θ and the one with the lower iteration count is chosen.
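The two-option selection amounts to a binary tournament whose criterion is itself chosen at random. A minimal sketch, assuming each solution is a dict with `"makespan"` and `"iterations"` fields (the field names are not from the paper):

```python
import random

# Selection mechanism: with probability 0.5 compare two random solutions by
# makespan, otherwise by iteration count; return the better one as phi.

def select(theta, rng=random):
    a, b = rng.sample(theta, 2)                 # two distinct random solutions
    if rng.random() < 0.5:                      # option 1: lower makespan wins
        return a if a["makespan"] <= b["makespan"] else b
    return a if a["iterations"] <= b["iterations"] else b
```

When one solution dominates the other on both criteria, it is returned regardless of which option fires, which keeps the mechanism mildly elitist while still injecting randomness.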
The selection mechanisms for p_best, g_best, and the other guides focus mainly on the objective space, i.e., on how to fully exploit candidate solutions with good convergence and diversity there. These guides serve as a bridge between the objective space and the decision variable space, and the new guide selection mechanism can balance the exploration and exploitation of the decision variable space in an integrated way.

Three Neighborhood Structures Based on Insert Operator
To solve the three-stage DAPFSP, this paper introduces three neighborhood structures based on the INSERT operator, which constitute the three-stage bat algorithm (3SBA). The insert operator is regarded by many scholars as a high-performing operator; the neighborhoods are described as follows.
• Product-based Neighborhood (PBN): in the product assembly order, a randomly selected product is inserted at another random position.
• Factory-based Neighborhood (FBN): a job is randomly selected from the solution representation and assigned to another randomly selected plant. For this plant, all possible positions where the job can be inserted are considered, and the partial job sequence with the earliest completion time is selected. The job sequences assigned to the other plants remain unchanged during this process.
• Job Sequence-based Neighborhood (JSBN): a plant is randomly selected and its job sequence is extracted from the solution representation. A job is then randomly selected from this partial sequence and inserted at another random position. Finally, the jobs of this partial sequence, together with their product priorities and plant, are written back in order to the positions of the solution representation belonging to that plant.
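Two of these insert-based moves can be sketched directly; FBN is omitted because it also requires re-evaluating completion times. All names and data layouts below are illustrative:

```python
import random

# PBN: reinsert a random product at a random position in the product order.
def pbn(products, rng=random):
    s = products[:]
    p = s.pop(rng.randrange(len(s)))            # remove a random product
    s.insert(rng.randrange(len(s) + 1), p)      # reinsert at a random position
    return s

# JSBN: pick a plant, then reinsert a random job within that plant's sequence.
def jsbn(plant_sequences, rng=random):
    seqs = [seq[:] for seq in plant_sequences]
    seq = seqs[rng.randrange(len(seqs))]        # pick a plant (assumed non-empty)
    j = seq.pop(rng.randrange(len(seq)))        # remove a random job
    seq.insert(rng.randrange(len(seq) + 1), j)  # reinsert at a random position
    return seqs
```

Both moves are permutation-preserving: they change only the order, never the set of products or the plant membership of jobs, which is what makes them safe building blocks for the variable neighborhood search described next.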

Local Search Method Based on Variable Neighborhood Structure
Based on the above three neighborhoods, PBN, FBN and JSBN, this section proposes a local search method with variable neighborhood search. Here n(s) denotes the neighborhood structure; the other parameters are consistent with the original bat algorithm, i.e., the maximum number of iterations for each neighborhood at the same frequency is denoted T_max and the frequency f is initialized to f_0. In 3SBA, the three neighborhoods are iterated sequentially, which avoids premature convergence throughout the search process.
Variable neighborhood descent (VND) is the simplest variant of variable neighborhood search [29]. It has been shown to be effective in solving flow shop [30] and distributed scheduling problems [31]. VND searches the neighborhood structures in order, from the smallest to the largest. Starting from the initial solution, VND first searches the first neighborhood to find a local optimum. Then, while searching the second neighborhood, if a better local optimum is found, the algorithm returns to the first neighborhood; otherwise, it continues with the third neighborhood, and so on. The algorithm terminates only when a local optimum with respect to all neighborhoods is found.
To differentiate the VNDs in this paper, two versions were developed based on the characteristics of the bat algorithm: the first uses three neighborhood structures and is named VND3BA, and the second uses two neighborhood structures and is named VND2BA. The VND that best fits the bat algorithm and the scheduling problem is selected by calibration through simulation experiments.
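The VND loop described above can be written compactly. The sketch below assumes each neighborhood is a function returning the best neighbor of a solution and that `makespan` evaluates a solution; these interfaces are assumptions, not the paper's code:

```python
def vnd(solution, neighborhoods, makespan):
    """Variable neighborhood descent: on any improvement the search
    restarts from the first neighborhood; it stops only when the
    solution is locally optimal with respect to every neighborhood."""
    k = 0
    while k < len(neighborhoods):
        candidate = neighborhoods[k](solution)
        if makespan(candidate) < makespan(solution):
            solution, k = candidate, 0   # improvement: back to first neighborhood
        else:
            k += 1                       # no improvement: try the next neighborhood
    return solution
```

The restart-on-improvement rule is what distinguishes VND from simply cycling through the neighborhoods once.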

VND3BA Local Search Method Based on Three Neighborhood Structures
The first method is applied to each product contained in the plant. Each part of the product is removed in succession and tested at all possible positions within the product; if a better makespan is found, the part sequence is updated. This method runs until all parts contained in the plant have been considered and no further improvement is found, at which point it stops. The corresponding procedures are named LocalSearchInsideProduct() and PartLocalSearchInsideFactory(): the former searches individual products, while the latter searches the whole plant by repeatedly calling the former.
The second method is a local search over the products within a plant. The products are removed one by one, from first to last, and the best position for each is sought; if a better makespan is found in the process, the sequence of the plant is updated. The process is repeated until no product can be improved further.
The third method focuses on finding the most suitable position for a product across plants. First, all plants are checked and the plant p with the maximum completion time C_p is found. Then, its first product is removed and tested at all possible positions in the remaining plants. If the best completion time generated is lower than the original completion time C_p, the product is inserted at the corresponding position. The algorithm then iterates, again finding the plant with the maximum completion time, until a local optimum is found.
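One pass of this cross-plant move can be sketched as follows. The interfaces are assumed: `factories` maps a plant to its product sequence, and `completion_time` evaluates one plant's sequence; the paper's actual representation is richer than this.

```python
def rebalance_once(factories, completion_time):
    """Try the first product of the most loaded plant at every position
    of every other plant; keep the move only if the maximum completion
    time over all plants decreases."""
    worst = max(factories, key=lambda f: completion_time(factories[f]))
    product = factories[worst][0]
    best_cmax = max(completion_time(seq) for seq in factories.values())
    best = None
    for f, seq in factories.items():
        if f == worst:
            continue
        for pos in range(len(seq) + 1):
            trial = dict(factories)
            trial[worst] = factories[worst][1:]          # remove first product
            trial[f] = seq[:pos] + [product] + seq[pos:]  # test this position
            cmax = max(completion_time(s) for s in trial.values())
            if cmax < best_cmax:
                best_cmax, best = cmax, trial
    return best or factories
```

Iterating this pass until no improving move exists yields the local optimum described above.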

VND2BA Local Search Method Based on Two Neighborhood Structures
Since VND3BA considers the neighborhood of parts and the neighborhood of products separately, some available feasible solutions may be missed. Therefore, in this section, two local search algorithms are proposed to consider these two neighborhoods in an integrated manner.
The first approach explores the interior of a plant by extracting a whole product and finding the best position for that product within the plant, thereby extending the scope of the local search. When inserting the product, this method also adaptively finds the best part order for it.
The second approach exploits the local search capability of the bat algorithm itself to explore the optimal solution across plants. Therefore, it can be viewed as a combination of two processes named ProductLocalSearchBetweenFactories() and LocalSearchInsideProduct().

Gaussian Learning Strategy and Elite Learning Strategy
The Gaussian learning strategy (GLS) [32] and the elite learning strategy (ELS) [33] can help individual bats jump out of the local frontier and improve the local search ability of the algorithm. If the feasible solution is not updated for many iterations, the whole bat population has most likely fallen into a local optimum, and the GLS reset strategy is then executed. The collaborative learning mechanism of p_best and b_best ensures that the exploration and exploitation capability of the entire bat population is enhanced, allowing the algorithm to quickly jump out of the local frontier and approximate the true frontier.
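The GLS reset expression itself is not reproduced above. One common Gaussian-perturbation formulation, assumed here rather than quoted from the original, perturbs the current position with standard normal noise:

```latex
x_i' = x_i \cdot \bigl(1 + N(0,1)\bigr)
```

where x_i is the position of the i-th bat and N(0,1) is a standard normal random number.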
Although GLS can help individual bats jump out of the local frontier, this strategy alone cannot meet the requirements of the bat algorithm for solving the DAPFSP. To further improve the performance of the bat algorithm, ELS is introduced into the scheduling problem, with the following mathematical expression.
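The displayed ELS expression is not reproduced above. A standard ELS form consistent with the bounds notation defined below, stated here as an assumption rather than a quotation, perturbs one randomly chosen dimension j of the best solution:

```latex
x_{best}(j) = x_{best}(j) + \bigl(x_{ub}(j) - x_{lb}(j)\bigr) \cdot N(0,1)
```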
where x_ub(j) and x_lb(j) represent the upper and lower bounds of the j-th dimensional decision variable, respectively, and N(0,1) represents a random number drawn from a normal distribution with mean 0 and variance 1.

Test Problems and Parameter Settings
In this section, the proposed IHBA is compared with the competitive memetic algorithm (CMA) [34], the biogeography-based optimization algorithm (BBO) [35], the estimation of distribution algorithm (EDA) [36], genetic algorithms (GA) [19,21], and the particle swarm optimization algorithm (PSO) [20]; in addition, the DIWO algorithms were selected for comparison from the literature [37]. So far, the three DIWO algorithms have proved to be the state-of-the-art algorithms for the considered problem; the parameters are shown in Table 3. All simulation experiments were implemented on the MATLAB R2020a platform, and all programs were executed on an AMD A8-7100 Radeon R5 @ 1.80 GHz with 12.0 GB RAM under the Windows 10 operating system. To calibrate the proposed IHBA, a total of 810 instances were randomly generated according to the literature [5,34].

The Results and Discussion
The relative percentage deviation (RPD) expresses the deviation of a single measurement from a reference value as a percentage. To compare the efficiency of the algorithms, RPD values are used to compare and analyze these metaheuristic algorithms. The formula is shown in (32).
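Equation (32) is not reproduced above; the standard RPD definition, consistent with the terms explained immediately below, is:

```latex
RPD = \frac{C_{max}(\pi) - C_{max}(\pi)_{best}}{C_{max}(\pi)_{best}} \times 100\% \qquad (32)
```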
where C_max(π) is the total completion time of the current algorithm, and C_max(π)_best is the minimum C_max(π) among all compared algorithms. In this paper, the termination condition of an iteration is a maximum CPU runtime equal to C × m × n milliseconds, and results are computed for C = 20, 40, and 60. The results obtained are then compared with the other algorithms. The statistical methods used are the analysis of variance (ANOVA), widely used in parametric statistical procedures, and the Friedman test and Wilcoxon signed-rank test, commonly used in nonparametric statistical procedures. These methods have been widely used in the recent scheduling literature.
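The RPD metric and the CPU-time termination budget are both one-liners; a minimal sketch (names are illustrative):

```python
def rpd(cmax, cmax_best):
    """Relative percentage deviation of one algorithm's makespan from
    the best makespan found by any of the compared algorithms."""
    return 100.0 * (cmax - cmax_best) / cmax_best

def cpu_budget_ms(C, m, n):
    """Termination condition used in the experiments: C * m * n milliseconds,
    for C in {20, 40, 60}, m machines and n jobs."""
    return C * m * n
```

An algorithm that matches the best known makespan on an instance scores an RPD of exactly 0 for that instance.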

Comparative Analysis at C = 20
When C = 20, the RPD values of the various heuristic algorithms are shown in Table 4. As the bold figures show, the RPD value of the proposed IHBA is smaller than that of the other algorithms in the vast majority of combinations, including the recognized TDIWO algorithm and the original bat algorithm; IHBA is outperformed in only five sets of data. Meanwhile, IHBA obtains not only the lowest relative percentage deviation values but also the smallest average RPD value, 21.15%, with an error of about 0.83% from the optimal value. This demonstrates the strong advantage and superiority of the IHBA algorithm. GA and PSO also perform very well, with average RPD values of 32.09% and 31.52%, respectively. Although TDIWO is not satisfactory in this comparison, it is significantly better than BBO and EDA. Most surprising is the original bat algorithm, with a result of 27.69%, second only to the algorithm proposed in this paper; this shows that the bat algorithm itself performs well and has strong search capability.
The descriptive statistics of the Friedman test for C = 20 are shown in Table 5, where N is the number of test cases and takes the value 4050. The minimum value of each item is marked in bold. At the significance level α = 0.05, the p-value of the test statistic is reported as 0, which is less than 0.05, indicating significant differences between the compared algorithms. Table 5 shows that the rank of IHBA is only 2.60, close to one-half of the ranks of CMA, BBO and TDIWO, and smaller than the ranks of GA, PSO and the original bat algorithm by 1.39, 1.09 and 0.78, respectively.
Although IHBA does not achieve the best maximum value, its standard deviation and minimum value are the smallest among all algorithms. The comparison between the groups also shows that the rank of EDA is more than three times that of IHBA, indicating the worst performance, which echoes the results and analysis in Table 4. In addition, the mean and standard deviation show that the three DIWO algorithms are very close to the proposed algorithm.
Next, the Wilcoxon signed-rank test is used as the non-parametric test for paired samples. This method was developed from the sign test for paired observations and is more powerful than a test based on positive and negative signs alone. The observed data are generally considered significantly different when p is less than 5%; when p is greater than or equal to 5%, the difference is considered insignificant. The results of the Wilcoxon signed-rank test are given in Table 6, under the null hypothesis that there is no difference between each pair of algorithms (α = 0.05). In the table, R+ represents the rank sum of positive differences and R− the rank sum of negative differences; for each pair of algorithms, the sum of R+, R− and the tied values equals N in Table 5. Table 6 also shows that the p-values in the last column are all reported as 0, i.e., less than 0.001, indicating that the differences between the compared algorithms are statistically significant. In other words, there is a significant difference between IHBA and each of the other algorithms.
Finally, to make the observations more intuitive, this section analyzes the results of the ANOVA for the eight algorithms. The methods used are the mean plot and least significant difference (LSD) intervals, and Figure 1 shows the resulting plots at the 95% confidence level.
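The R+ and R− quantities reported in Table 6 are the rank sums of the Wilcoxon signed-rank test. A self-contained sketch of their computation, following the standard procedure (zero differences discarded, tied absolute differences given average ranks):

```python
def wilcoxon_rank_sums(x, y):
    """Return (R+, R-): rank sums of positive and negative paired
    differences x[i] - y[i], as used in the Wilcoxon signed-rank test."""
    diffs = [a - b for a, b in zip(x, y) if a != b]   # drop zero differences
    order = sorted(range(len(diffs)), key=lambda i: abs(diffs[i]))
    ranks = [0.0] * len(diffs)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and abs(diffs[order[j + 1]]) == abs(diffs[order[i]]):
            j += 1
        avg = (i + j) / 2 + 1                         # average rank for ties in |d|
        for k in range(i, j + 1):
            ranks[order[k]] = avg
        i = j + 1
    r_plus = sum(r for d, r in zip(diffs, ranks) if d > 0)
    r_minus = sum(r for d, r in zip(diffs, ranks) if d < 0)
    return r_plus, r_minus
```

The test statistic is min(R+, R−); a heavily one-sided split of the rank sums is what drives the small p-values reported in Table 6.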
It is clear that IHBA performs best, beating the other comparison algorithms by a significant margin. The bat algorithm also has some advantages, GA and PSO continue to perform consistently, and EDA performs worst, followed by BBO.

Comparative Analysis at C = 40
The lower the RPD, the better the performance of an algorithm. When C = 40, the average RPD of the eight algorithms is shown in Table 7. As the last row shows, the average value of IHBA is 22.73%, significantly lower than that of the other algorithms, including the original bat algorithm; IHBA is thus more efficient than the other metaheuristics, while BA ranks second with an average RPD of 23.57%. Compared with Table 4, the RPD value of IHBA changes from 21.15 to 22.73 as C increases, showing that the RPD of IHBA grows with the number of iterations and the runtime. In addition, PSO, with a mean value of 35.76%, is significantly better than GA and the other four heuristics, while EDA has the largest value and the worst performance, consistent with the results in Table 4.
Table 8 describes the results of the Friedman test. As the rankings show at a glance, IHBA has a mean rank of 3.06, the smallest among these algorithms. The next best performer is TDIWO, although its standard deviation of 0.89 is larger than those of CMA, GA, and PSO. The mean and standard deviation of IHBA are 0.79 and 0.76, respectively, also the smallest values among all algorithms. The performance of EDA is still the worst, with its average rank, mean, standard deviation, and maximum value all at a disadvantage; in particular, its mean value is as high as 32 times that of IHBA. The p-value in the last row is 0, which is less than 0.05, indicating that the differences between the algorithms are statistically significant.
This difference is not due to sampling chance; rather, the difference between the algorithms is significant. The results of the Wilcoxon signed-rank test are shown in Table 9. The p-values in the last column are all reported as 0, which is less than 0.05, showing that the differences between these algorithms are statistically significant and that IHBA differs significantly from each of the other algorithms. This reconfirms the conclusion reached at C = 20 and, more definitely, shows that IHBA is an effective swarm intelligence algorithm with outstanding performance. Finally, EDA is excluded from the mean plot in this section because its values are extreme outliers and its performance is too poor, which would distort the plot. As Figure 2 shows, IHBA is once again stronger than the other metaheuristics by a clear margin, and the variability between the algorithms is again shown to be significant. Since the LSD test has the highest sensitivity, Figure 2 also indirectly verifies that IHBA performs well and can reach Pareto optimality.
Table 10 presents the results of the calculations at C = 60 and shows the validity of IHBA, which yielded an overall mean RPD value of 31.55%. The results of the Friedman test in Table 11, the results of the Wilcoxon signed-rank test in Table 12, and the mean plot in Figure 3 again show that the proposed IHBA performs best in the comparison.
Figure 3. Mean plot and 95% LSD interval of the bat algorithm at C = 60.

VND Method
In this section, the proposed VND3BA and VND2BA are compared with VNDH12, VNDH22, and VNDH32 proposed by Hatami et al., which are three versions of the VND method using H12, H22 and H32 to generate the initial solution. The VND method uses a product local search to improve the product sequence and a part local search to improve the part sequence of each product. Tables 13 and 14 show the computational results and CPU times for the five VND algorithms.
As the tables show, VND3BA and VND2BA perform significantly better than VNDH12, VNDH22 and VNDH32 in terms of both solution quality and computational effort. The best overall average RPD value obtained by the existing VND algorithms, 15.233% over all instances, is more than three times higher than the best overall average RPD value obtained by VND3BA and VND2BA. For all values of n, m, f and s, the proposed VND algorithms produce better results than the existing VNDH12, VNDH22, and VNDH32, showing a stable advantage regardless of the particular n, m, f, and s involved. The proposed VND algorithms are also very fast, with overall average CPU times of 0.009 s and 0.057 s for VND3BA and VND2BA, respectively, compared with more than 5 s for the existing algorithms.

Conclusions
To address the challenge that existing heuristic and meta-heuristic algorithms cannot fundamentally solve the three-stage distributed assembly permutation flow shop optimization problem, this paper proposes a hybrid bat optimization algorithm based on a variable neighborhood structure and two learning strategies to minimize the completion time of this problem. The algorithm designs two populations, one search-based and one capture-based, to resolve the difficult trade-off between convergence and diversity when the bat algorithm solves the scheduling optimization problem. Moreover, by fully mining the population information, a new selection mechanism and a new velocity and location update strategy are designed to resolve the difficult trade-off between exploration and exploitation. The Gaussian learning strategy and the elite learning strategy are used to help the whole population jump out of the local optimal frontier. Simulation results show that IHBA solves the three-stage distributed assembly permutation flow shop scheduling problem well and performs better than existing algorithms in the literature.
In future research, the algorithm will be extended to more complex multi-objective optimization problems to further improve the efficiency of the iterative search, and the proposed algorithm will be applied to other scheduling problems.