A Novel Fast Parallel Batch Scheduling Algorithm for Solving the Independent Job Problem

With rapid economic development, manufacturing enterprises increasingly rely on efficient workshop production scheduling systems to strengthen their competitive position. The classical workshop production scheduling problem is far removed from actual production conditions, which makes it difficult to apply in practice. In recent years, machine scheduling has become a hot research topic in the field of manufacturing systems. This paper considers the batch processing machine (BPM) scheduling problem of scheduling independent jobs with arbitrary sizes, with the objective of minimizing the makespan. Each machine has its own capacity and can only handle jobs whose sizes do not exceed that capacity. Multiple jobs can be processed simultaneously as a batch on one machine, provided their total size does not exceed the machine capacity. The processing time of a batch is the longest processing time among the jobs in the batch. A novel and fast 4.5-approximation algorithm is developed for this scheduling problem. For the special case in which all jobs have the same processing time, a simple and fast 2-approximation algorithm is obtained. The experimental results show that the performance ratios observed for the fast algorithms are well below the proven bounds. Compared with the solutions generated by CPLEX, the fast algorithms are capable of generating a feasible solution within a very short time and at much lower computational cost.


Introduction
Reducing the production cycle and improving resource utilization under workshop production constraints, such as delivery times, technical requirements, and resource status, is an important problem. Most enterprises adopt workshop scheduling technology to address it. An effective scheduling optimization method can make full use of the many production resources in the workshop, and the research and application of workshop scheduling optimization methods has become one of the basic topics of advanced manufacturing technology [1][2][3].
Batch processing machines (BPMs) are widely applied in many industries, for example, steel casting, chemical processing, and mineral processing [4][5][6]. The BPM scheduling problem is a hot topic within workshop scheduling. In the traditional scheduling problem, each machine can process at most one job at a time [7]. However, a BPM can process a number of jobs simultaneously as a batch.
Before we move on, let us introduce some useful notation and terminology. Let J_i = {J_j ∈ J : K_{i-1} < s_j ≤ K_i} for i = 1, 2, ..., m and j = 1, 2, ..., n; for i = 1 the quantity K_0 appears, which would otherwise be meaningless, so we set K_0 = 0. It is possible that J_i = ∅ for some i, and J = ∪_{i=1}^{m} J_i. Let a_j = i denote the index of the machine with the minimum capacity that can process job J_j ∈ J; then J_j can be assigned to each machine in M_j = {M_{a_j}, M_{a_j+1}, ..., M_m}, where 1 ≤ a_j ≤ m. The machines M_{a_j}, M_{a_j+1}, ..., M_m are called the golden machines for job J_j, M_j is its golden machine set, J_j is called a golden job for every M ∈ M_j, and all of the jobs that can be processed by M_i form the golden job set of M_i. In a schedule, the running time (load) of a machine equals the total processing time of the batches scheduled on it.
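To make the partition into J_1, ..., J_m and the golden-machine index a_j concrete, the following minimal sketch computes both (the function name `partition_jobs` and the variable names are ours, not from the paper):

```python
import bisect

def partition_jobs(sizes, capacities):
    """Partition jobs into J_1..J_m by size: job j belongs to J_i when
    K_{i-1} < s_j <= K_i (with K_0 = 0). capacities must be sorted
    in non-decreasing order."""
    m = len(capacities)
    J = [[] for _ in range(m)]
    for j, s in enumerate(sizes):
        # a_j: index of the smallest-capacity machine that can fit the job;
        # machines M_{a_j}..M_m are then the golden machines of job j
        i = bisect.bisect_left(capacities, s)
        if i == m:
            raise ValueError(f"job {j} (size {s}) fits on no machine")
        J[i].append(j)
    return J

# Machines M_1..M_3 with capacities K = (10, 20, 40):
caps = [10, 20, 40]
print(partition_jobs([4, 12, 35, 10], caps))  # [[0, 3], [1], [2]]
```

Note that a job of size exactly K_1 (here, size 10) lands in J_1, since the interval is K_0 < s_j ≤ K_1.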
The structure of the paper is as follows. Section 2 reviews previous research in related areas. Section 3 gives the definition of the research problem. In Section 4, a novel fast 4.5-approximation algorithm is proposed for problem P|s_j, p-batch, K_i|C_max. In Section 5, a fast 2-approximation algorithm is proposed for problem P|s_j, p_j = p, p-batch, K_i|C_max. Section 6 presents several computational experiments that show the effectiveness of the fast algorithms. Finally, conclusions are given in Section 7.

Literature Review
Since the 1980s, scholars have extensively studied the job scheduling problem on parallel batch machines [1]. In this section, we review research dealing with different job sizes and minimization of the maximum completion time [14][15][16][17][18][19][20].
The one-machine case of problem P|s_j, p-batch, K_i|C_max is denoted 1|s_j, p-batch, B|C_max. Uzsoy [21] proved that 1|s_j, p-batch, B|C_max is strongly NP-hard (non-deterministic polynomial) and presented four heuristics. Zhang et al. [22] proposed a 1.75-approximation algorithm for 1|s_j, p-batch, B|C_max.
Dupont and Flipo presented a branch-and-bound method for 1|s_j, p-batch, B|C_max. Dosa et al. [23] presented a 1.7-approximation algorithm for the same problem.
Li et al. [24] presented a (2 + ε)-approximation algorithm for 1|r_j, s_j, p-batch, B|C_max (the more general case where jobs have different release times), where ε > 0 can be made arbitrarily small.
The special case of P|s_j, p-batch, K_i|C_max where all K_i = B (B < n) is denoted P|s_j, p-batch, B|C_max. Chang et al. [25] studied P|s_j, p-batch, B|C_max and provided an algorithm based on the simulated annealing approach.
Dosa et al. [23] demonstrated that, even when all jobs have the same processing time, P|s_j, p-batch, B|C_max cannot be approximated within a ratio smaller than 2 unless P = NP, and they presented a (2 + ε)-approximation algorithm. Cheng et al. [26] presented an 8/3-approximation algorithm for P|s_j, p-batch, B|C_max with running time O(n log n). Chung et al. [27] developed a mixed integer programming model and some heuristic algorithms for P|r_j, s_j, p-batch, B|C_max (the problem where jobs have different release times). A 2-approximation algorithm for P|r_j, s_j, p_j = p, p-batch, B|C_max (the special case of P|r_j, s_j, p-batch, B|C_max where all jobs have the same processing time) was given by Ozturk et al. [28]. Li [29] obtained a (2 + ε)-approximation algorithm for P|r_j, s_j, p-batch, B|C_max.
More recently, several research groups have focused on scheduling problems on parallel batch machines with different capacities and their applications in many fields [30][31][32][33][34][35][36][37][38][39][40][41]. The special case of P|s_j, p-batch, K_i|C_max where all s_j ≤ K_1 (i.e., every job can be assigned to any machine) is denoted P|s_j ≤ K_1, p-batch, K_i|C_max. Costa et al. [30] studied P|s_j ≤ K_1, p-batch, K_i|C_max and developed a genetic algorithm for it. Wang and Chou [31] proposed a metaheuristic for P|r_j, s_j ≤ K_1, p-batch, K_i|C_max (the problem where jobs have different release times). Damodaran et al. [32] proposed a particle swarm optimization (PSO) method for P|s_j, p-batch, K_i|C_max. Jia et al. [33] presented a heuristic and a metaheuristic for P|s_j, p-batch, K_i|C_max. Wang and Leung [34] analyzed the problem P|s_j, p_j = 1, p-batch, K_i|C_max, where each job has unit processing time; they designed a 2-approximation algorithm for the problem and also obtained an algorithm with asymptotic approximation ratio 3/2. Li [35] proposed a fast 5-approximation algorithm and a (2 + ε)-approximation algorithm for P|s_j, p-batch, K_i|C_max, but the (2 + ε)-approximation algorithm has high time complexity when ε is small. Jia et al. [36] presented several heuristics for P|r_j, s_j, p-batch, K_i|C_max (the problem where jobs have different release times) and evaluated their validity through computational experiments. Other methods have also been proposed in the literature [42][43][44][45][46][47][48][49][50][51][52][53].
In this paper, a novel fast 4.5-approximation algorithm is developed for problem P|s_j, p-batch, K_i|C_max, and its performance is evaluated via computational experiments. We also provide a simple and fast 2-approximation algorithm for the case where all jobs have the same processing time (P|s_j, p_j = p, p-batch, K_i|C_max), improving upon and generalizing the results in [54][55][56][57]. The approximation ratio of our 2-approximation algorithm equals that of the algorithm presented in [26], but our algorithm is simpler to understand and easier to implement.

Mathematic Formulation of the Problem
In this section, we formulate the problem under consideration as a mixed integer linear programming (MILP) model. First, the problem parameters and decision variables are given, and then the model is presented. Table 2 shows the problem indices, and Table 3 shows the decision variables. Table 3. Decision variables.

Decision Variables | Description
x_jil | 1, if job J_j is assigned to the l-th batch processed on machine M_i; 0, otherwise.
y_il | The processing time of the l-th batch processed on machine M_i.
C_max | The makespan.
Objective Function (1) shows that our aim is to find a schedule that minimizes the makespan C_max. Constraint (2) ensures that each job is assigned to exactly one batch on one machine. Constraint (3) guarantees that every batch is feasible; in other words, the total size of the jobs assigned to a batch does not exceed the capacity of the machine on which the batch is scheduled. Constraint (4) indicates that the processing time of a batch is not less than the processing time of any job in the batch. Constraint (5) guarantees that the makespan of the schedule is not less than the maximum load over all machines. In Constraint (6), the 0-1 variable x_jil indicates whether job J_j is assigned to the l-th batch on machine M_i (x_jil = 1) or not (x_jil = 0).
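The objective and constraints described above can be written compactly as follows (a reconstruction from the prose; L denotes an upper bound on the number of batches per machine, e.g., L = n):

```latex
\begin{align}
\min\ & C_{\max} && (1)\\
\text{s.t.}\ & \sum_{i=1}^{m}\sum_{l=1}^{L} x_{jil} = 1, && j = 1,\dots,n \quad (2)\\
& \sum_{j=1}^{n} s_j\, x_{jil} \le K_i, && \forall i,\ l \quad (3)\\
& y_{il} \ge p_j\, x_{jil}, && \forall j,\ i,\ l \quad (4)\\
& C_{\max} \ge \sum_{l=1}^{L} y_{il}, && i = 1,\dots,m \quad (5)\\
& x_{jil} \in \{0,1\},\quad y_{il} \ge 0. && (6)
\end{align}
```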

A 4.5-Approximation Algorithm for P|s_j, p-batch, K_i|C_max
We denote the optimal makespan of problem P|s_j, p-batch, K_i|C_max by OPT. The main focus of the research is to develop a fast scheduling algorithm whose makespan is as close to OPT as possible.
To solve problem P|s_j, p-batch, K_i|C_max, we use the MBLPT (modified longest processing time batch) rule [35], a modification of the BLPT (longest processing time batch) rule. For a given job set J_i that can be assigned to machine M_i, the MBLPT rule first sorts the jobs in non-increasing order of processing time, obtaining the list J_i'. It builds a batch B_{i,1} on machine M_i, and then repeatedly pops the first job from J_i' and assigns it to B_{i,1} until the total size of the jobs assigned to B_{i,1} just exceeds the capacity of M_i. Batch B_{i,1} is called a one-job-overfull batch. Once a batch becomes one-job-overfull, a new batch is opened on the same machine, unless the machine runs out of maximum completion time (the maximum completion time is among the initialization parameters of the algorithm). We repeat this assignment procedure until the job list J_i' is empty.
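The batch-forming step of the MBLPT rule can be sketched as follows (a minimal illustration; the function name `mblpt_batches` is ours, and the maximum-completion-time cutoff is omitted for brevity):

```python
def mblpt_batches(jobs, capacity):
    """MBLPT batch formation sketch. jobs is a list of
    (processing_time, size) pairs. Jobs are taken in non-increasing
    processing-time order; each batch is filled until its total size
    first EXCEEDS the capacity (a 'one-job-overfull' batch), after
    which a new batch is opened."""
    queue = sorted(jobs, key=lambda job: -job[0])  # LPT order
    batches, current, load = [], [], 0
    for job in queue:
        current.append(job)
        load += job[1]
        if load > capacity:        # batch is now one-job-overfull: close it
            batches.append(current)
            current, load = [], 0
    if current:                    # the last batch may be under capacity
        batches.append(current)
    return batches
```

Because jobs are added in LPT order, the processing time of each batch (the longest job in it) is simply the processing time of its first job.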
Let B_i = {B_{i,g} : g = 1, 2, ..., h_i} denote the set of batches generated by applying the MBLPT rule to J_i and machine M_i, where h_i is the total number of batches scheduled on machine M_i. Let p(B_{i,g}) and p_s(B_{i,g}) denote the longest and shortest processing times of the jobs in batch B_{i,g}, respectively (the processing time of batch B_{i,g} equals the longest processing time of the jobs in it); then Inequality (7) below (refer to [27]) is easy to prove.
By Inequality (7), we obtain the lower bound used below. We now propose the 4.5-approximation algorithm for P|s_j, p-batch, K_i|C_max. Similar frameworks have been used in [58][59][60][61][62]. In [58], Ou et al. developed a 4/3-approximation algorithm for the classical makespan minimization problem on parallel machines with processing set constraints. In [59], Li proposed a 9/4-approximation algorithm for P|s_j = 1, p-batch, K_i|C_max (the special case of P|s_j, p-batch, K_i|C_max where all s_j = 1). The algorithm described below extends this previous research to non-identical job sizes.
We first run the 5-approximation algorithm of [35] for P|s_j, p-batch, K_i|C_max. That algorithm generates a feasible schedule with makespan UB ≤ 5·OPT in O(n log m + n^2) time. Let LB = UB/5; then LB ≤ OPT ≤ UB. We use binary search to locate the makespan of a feasible solution within the interval [LB, UB]. First, set T = (LB + UB)/2, and classify both the jobs and the batches as long, median, or short: a job J_j is long if p_j > T/2, median if T/4 < p_j ≤ T/2, and short if p_j ≤ T/4, and batches are classified analogously by their processing times. Naturally, long batches may contain median and short jobs, and median batches may contain short jobs. After classification, we use the following SCMF-LPTJF (smallest capacity machine first processed and longest processing time job first processed) procedure, which permits one-job-overfull batches, to search for a schedule with makespan at most 9T/4. If this attempt fails, we continue the search in the upper half of the interval by setting LB = T; otherwise, we continue in the lower half, record the current solution, and set UB = T. The binary search is then repeated on the new interval [LB, UB] until LB ≥ UB.
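The binary-search framework above can be sketched as follows; `try_schedule(T)` stands in for the SCMF-LPTJF procedure and is assumed to return a schedule with makespan at most 9T/4 when one exists, or None on failure (the function names are our illustration, not the paper's):

```python
def binary_search_makespan(lb, ub, try_schedule, iters=60):
    """Shrink [LB, UB] by bisection. On success (try_schedule returns a
    schedule), record it and search the lower half; on failure, search
    the upper half. iters caps the loop for floating-point bounds."""
    best = None
    for _ in range(iters):
        if lb >= ub:
            break
        T = (lb + ub) / 2
        schedule = try_schedule(T)
        if schedule is None:
            lb = T                   # no schedule for this T: search upward
        else:
            best, ub = schedule, T   # feasible for this T: try a smaller T
    return best
```

With the initial interval [UB/5, UB], the number of iterations needed to reach integer precision is O(log(sum of p_j)), matching the running-time analysis below.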

Lemma 2.
If OPT ≤ T, then the SCMF-LPTJF algorithm generates a schedule for P|s_j, p-batch, K_i|C_max with one-job-overfull batches whose makespan is at most 9T/4.

Proof.
Let Σ be an optimal schedule whose makespan is OPT, and let H be the set of long jobs and median jobs.

Algorithm 1. SCMF-LPTJF (smallest capacity machine first processed and longest processing time job first processed)
Input: T
Output: C_max, the best found solution; RT, the running time
1: Q_0 = ∅, AssignedJS = ∅ // AssignedJS denotes the set of jobs already assigned to batches
2: for i = 1 to m do
3:   sort J_i in non-increasing order of job processing time, obtaining J_i'
4: end for
5: for i = 1 to m do
6:   apply the MBLPT rule to Q_i and M_i, obtaining B_i = {B_{i,g} : g = 1, 2, ..., h_i}
7:   sort B_i in non-increasing order of batch processing time, obtaining B_i'
8:   partition the batches into the long, median, and short batch sets LongBS, MedianBS, and ShortBS
9: end for

In Σ, each machine can process at most three median batches, or one long batch and one median batch. On the other hand, the SCMF-LPTJF procedure places a long batch on a machine whenever possible; after it assigns a long batch to a machine, that machine still has enough time (at least 5T/4) to handle at least two median batches. Note that the SCMF-LPTJF procedure forms batches greedily (it overfills each batch with the longest currently unassigned jobs). Therefore, the SCMF-LPTJF procedure allocates at least as much processing time for long jobs and median jobs on the machines with smaller capacities as Σ does.

Consequently, if OPT ≤ T but some job j remains unassigned when the SCMF-LPTJF procedure terminates, then job j must be a short job. When job j is considered, all of the machines M_{a_j}, M_{a_j+1}, ..., M_m have load greater than 2T. Let i_max < a_j be the largest index such that machine M_{i_max} has load at most 2T; if all of the machines have load greater than 2T, set i_max = 0. Then all of the machines M_{i_max+1}, M_{i_max+2}, ..., M_m have load greater than 2T.

There is room on machine M_{i_max} for scheduling any short job, so by the rule of the SCMF-LPTJF procedure, no short job that can be processed by M_{i_max} remains unassigned. It follows that, in Σ, all of the remaining short jobs must be processed on machines M_{i_max+1}, ..., M_m; since the processing time already placed on each of these machines exceeds 2T, this contradicts OPT ≤ T.

Finally, we obtain a schedule with one-job-overfull batches (Figure 1a) whose makespan is at most 9·OPT/4. We can turn it into a feasible schedule (Figure 1b) whose makespan is at most 9·OPT/2 as follows: for each one-job-overfull batch, move the last packed job into a new batch and schedule the new batch on the same machine. Since the number of binary-search iterations is O(log(Σ_{j=1}^{n} p_j)), the following theorem is obtained.

Theorem 1.
There is a 4.5-approximation algorithm for P|s_j, p-batch, K_i|C_max that runs in O(n^2 + mn log p_sum) time, where p_sum = Σ_{j=1}^{n} p_j.
In order to achieve a strongly polynomial time algorithm, we modify the above algorithm slightly, terminating the binary search once UB − LB ≤ ε·LB. Therefore, the following theorem is obtained.

Theorem 2.
There is a (4.5 + ε)-approximation algorithm for P|s_j, p-batch, K_i|C_max that runs in O(n^2 + mn log(1/ε)) time, where ε > 0 can be made arbitrarily small.

A 2-Approximation Algorithm for P|s_j, p_j = p, p-batch, K_i|C_max
In this section, we study P|s_j, p_j = p, p-batch, K_i|C_max, i.e., the problem of minimizing the makespan with equal processing times (p_j = p), arbitrary job sizes (which may exceed the capacities of some machines), and non-identical machine capacities.
The 2-approximation algorithm is called LIM (largest index machine first considered). It greedily groups the jobs of J_m, J_{m-1}, ..., J_1, in this order (the ordering is crucial), into batches. During the run of the algorithm, Load_i denotes the load on machine M_i, i.e., the total processing time of the batches on M_i, i = 1, 2, ..., m. The algorithm dynamically maintains a variable x, which is the currently largest index such that Load_x < Load_m; if there is no such index, we set x = m. The next generated batch is assigned to machine M_x. The following fragment of the LIM pseudocode tops up an under-full batch b with jobs from the lower classes:

22: if b is not empty and b.size ≤ K_x then
23:   while b.size ≤ K_x and J_{i-1}' ‖ J_{i-2}' ‖ ... ‖ J_1' − AssignedJS is not empty do
24:     get the first job j from J_{i-1}' ‖ J_{i-2}' ‖ ... ‖ J_1' − AssignedJS
25:     assign job j to b
26:     b.size = b.size + j.size
27:     remove job j from J_{i-1}' ‖ J_{i-2}' ‖ ... ‖ J_1' and add it to AssignedJS
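The machine-selection rule of LIM can be sketched as follows (a simplified illustration: the function name `lim_assign` is ours, batch formation and the top-up step are abbreviated, and each batch contributes unit load, corresponding to p_j = p = 1):

```python
def lim_assign(batches_by_class, m):
    """LIM machine-selection sketch. batches_by_class[i] holds the
    batches formed from class J_{i+1} (0-based indexing); classes are
    handled in the order J_m, J_{m-1}, ..., J_1. A batch from class
    J_{i+1} may run on machines M_{i+1}..M_m; it is placed on M_x, the
    largest-index eligible machine whose load is below Load_m, or on
    M_m if there is no such machine."""
    load = [0] * m
    assignment = []
    for i in range(m - 1, -1, -1):            # classes J_m down to J_1
        for batch in batches_by_class[i]:
            x = m - 1                          # default: machine M_m
            for k in range(m - 1, i - 1, -1):  # largest eligible index first
                if load[k] < load[m - 1]:
                    x = k
                    break
            load[x] += 1                       # unit batch processing time
            assignment.append((batch, x))
    return load, assignment
```

The invariant that no machine ever carries more load than M_m is exactly what the proof of Theorem 3 below relies on.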

Theorem 3.
Algorithm LIM is a 2-approximation algorithm for P|s_j, p_j = p, p-batch, K_i|C_max.
Proof. Let Σ_1 be the schedule with makespan SOL_1 generated by LIM after Step 2. In Σ_1, every batch starts processing as soon as it is assigned to a machine. During the run of the algorithm, the load on any machine is always at most the load on M_m; therefore, M_m finishes last in Σ_1. Let B_last be the last batch assigned to M_m, and let M_l, M_{l+1}, ..., M_m be the processing set of B_last, defined as the processing set of the largest-size job in B_last. In Σ_1, let S(B_last) denote the start time of B_last. We have SOL_1 = S(B_last) + p.
Since B_last was assigned to M_m, at that moment x = m must have held. Hence, machines M_l, M_{l+1}, ..., M_m are all busy throughout the interval (0, S(B_last)). All the batches allocated to machines M_l, M_{l+1}, ..., M_m before S(B_last) are one-job-overfull batches. All jobs in these batches, together with the largest-size job in B_last, must be processed on machines M_l, M_{l+1}, ..., M_m in any feasible schedule. Hence, we get OPT ≥ S(B_last) + p, and we conclude that SOL_1 ≤ OPT.
For the feasible schedule with makespan SOL generated by LIM, we therefore have SOL ≤ 2·SOL_1 ≤ 2·OPT.

Experimental Environment
For the performance evaluation of the 4.5-approximation and 2-approximation algorithms, all instances were generated by a random algorithm, as in [63][64][65][66][67][68]. In generating the instances, five factors affecting the difficulty of the problem were controlled: the number of jobs, the number of machines, the variation in job sizes, the variation in job processing times, and the variation in machine capacities [69][70][71][72][73][74][75].
The experiment is divided into two parts: (1) the 4.5-approximation algorithm is compared with CPLEX; (2) the 2-approximation algorithm is compared with CPLEX. The 4.5-approximation and 2-approximation algorithms were coded in C#, and the CPLEX model was programmed in OPL (Optimization Programming Language), compiled, and run with IBM ILOG CPLEX Optimization Studio 12.5.1.0 (Education Version). All the algorithms were run on the same machine (Win10, Intel(R) i7-4790, 16 GB).
First, we set the number of machines to two or four, with the capacity of each machine drawn uniformly from the integers in [10, 40]. Then, random problem instances with 10, 20, 50, 100, 200, and 300 jobs were generated, with each job processing time p_j sampled uniformly from the integers in [1, 10]. The factor settings of the experiment are summarized in Table 4. Table 4. Factor settings of the experiment.

Factors | Levels
Number of jobs (n) | 10, 20, 50, 100, 200, 300
Number of machines (m) | 2, 4
Size of jobs (s) | [1, 10], [11, max(K_i)]
Processing time of jobs (P) | [1, 10]
Capacity of machines (K) | [10, 40]

We combine the parameter levels and randomly generate 50 instances for each combination (a test suite). Each test suite is denoted by a code; for instance, a test suite with 50 jobs and two machines is denoted J3M1S1P1K1.
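The instance generation described above can be sketched as follows (an illustration only; the function name `generate_instance` is ours, and the two job-size levels of Table 4 are collapsed into the single range [1, max capacity] for brevity):

```python
import random

def generate_instance(n, m, seed=None):
    """Random instance sketch following the factor levels of Table 4:
    machine capacities uniform in [10, 40], processing times uniform
    in [1, 10], and job sizes uniform in [1, max capacity] so every
    job fits on at least one machine."""
    rng = random.Random(seed)  # seeded for reproducible test suites
    capacities = sorted(rng.randint(10, 40) for _ in range(m))
    jobs = [(rng.randint(1, 10),                # processing time p_j
             rng.randint(1, capacities[-1]))    # size s_j
            for _ in range(n)]
    return capacities, jobs

caps, jobs = generate_instance(n=10, m=2, seed=0)
assert all(s <= caps[-1] for _, s in jobs)  # every job fits on M_m
```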

Comparison of 4.5-Approximation Algorithm and CPLEX
Here, CPLEX is used to solve the MILP model given in Section 3, and we compare its results with those of the 4.5-approximation algorithm. CPLEX aims for the optimal solution, but it cannot obtain the optimum for all instances even after running for several hours. Therefore, we set an execution time limit of 1800 s for CPLEX and compared against the best-known solution. The job size and machine capacity distributions are shown in Figure 2.

Figure 3 shows the results of test suite J1M1S1P1K1. For the 4.5-approximation algorithm, LB and UB are initialized as described in Section 4. Let C'_max denote the makespan of the schedule with one-job-overfull batches produced by the 4.5-approximation algorithm, and C_max the makespan of the corresponding feasible schedule. Figure 3a compares these two makespans, and Figure 3b,c confirm the theoretical guarantees C'_max ≤ 9T/4 and C_max ≤ 4.5T. Figure 3d shows that the running time of the 4.5-approximation algorithm is clearly shorter than that of CPLEX. Table 5 shows the results of all test suites. Although CPLEX is a leading solver for mixed integer linear programs, it often cannot prove optimality within a reasonable time, so we terminated CPLEX after 1800 s and used the best integer solution found for comparison.

The results illustrate that the 4.5-approximation algorithm is more efficient than CPLEX on test suites of every scale. For the small-scale test suite (10 jobs and two machines), the best solution obtained by the 4.5-approximation algorithm is closest to the best CPLEX solution. For the medium-scale and large-scale test suites, the average result of the 4.5-approximation algorithm never exceeds 4.5T.


Comparison of 2-Approximation Algorithm (LIM) and CPLEX
For the problem P|s_j, p_j = p, p-batch, K_i|C_max, we minimize the makespan with equal processing times, arbitrary job sizes (which may exceed the capacities of some machines), and different machine capacities. The processing time of the jobs was set to a default value of 8, and the bounds were initialized as LB = 8 and UB = 8n. Table 6 shows the experimental results of CPLEX and the LIM algorithm for all the test suites. Column SOL-AVG (the average value of SOL) reports the average makespan obtained by the LIM algorithm. Compared with the CPLEX makespan, the LIM algorithm obtains an efficient solution in very little running time (column Run Times).

Conclusions and Future Works
This paper analyzed the parallel batch scheduling problem of minimizing the makespan, where jobs have arbitrary sizes and machines have different capacities. Each machine can only handle jobs whose sizes do not exceed that machine's capacity. We developed an efficient 4.5-approximation algorithm for this problem, and the experimental results show that the algorithm obtains a reasonable solution in a short time. A 2-approximation algorithm was obtained for the special case of equal processing times. Computational experiments show that the fast algorithms can help to improve the efficiency of resource consumption and give practitioners more choices for balancing solution quality against running time in parallel batch scheduling.
Several important related directions are worth researching in the future. First, how can the fast algorithm be improved to get closer to the optimal solution in the shortest time? In addition, jobs with release times are common in BPM problems in the manufacturing industry, and developing a fast scheduling algorithm for this setting is an important direction. Finally, BPM problems with different service levels can be considered as well.