An Improved Arc Flow Model with Enhanced Bounds for Minimizing the Makespan in Identical Parallel Machine Scheduling

Abstract: In this paper, an identical parallel machine scheduling problem was considered with the objective of minimizing the makespan. This problem is NP-hard in the strong sense. A mathematical formulation based on an improved arc flow model with enhanced bounds was proposed. A variable neighborhood search algorithm was proposed to obtain an upper bound, and three lower bounds from the literature were utilized in the improved arc flow model to improve the efficiency of the mathematical formulation. In addition, a graph compression technique was proposed to reduce the size of the graph. The improved arc flow model was then compared with an arc flow model from the literature. The computational results on benchmark instances showed that the improved arc flow model outperformed the literature arc flow model, finding optimal solutions for 99.97% of the benchmark instances, with the overall reduction in computation time reaching 87%.


Introduction
Scheduling is a decision-making process that deals with optimizing resource allocation to perform a collection of tasks in production or manufacturing processes. Scheduling is involved in many real-life applications, such as jobs in a manufacturing plant, customers waiting for services in front of a teller's window, or airplanes waiting for clearance to land or take off at an airport. Many studies have been carried out by researchers in several scheduling environments, such as single machine scheduling and parallel machine scheduling, to fill some gaps in these problems.
Single machine scheduling requires only the sequencing of jobs. However, scheduling problems with more than one machine, such as parallel machines, involve both resource allocation and sequencing, rather than simple sequencing [1]. The importance of parallel machine scheduling may be viewed from both theoretical and practical perspectives [2]. From a theoretical perspective, it is a general case of single machine scheduling and a special case of the flexible flow shop. From a practical perspective, it is important because parallel operations are frequent in the real world. The makespan objective on parallel machines is of considerable interest, since it balances the load between the machines.
In the parallel machine scheduling problem, each job is processed on only one machine and each machine can process one job at a time. The identical parallel machine problem is a variant of machine scheduling characterized by a set of n independent jobs J = {J1, J2, . . . , Jn} to be scheduled on m identical parallel machines M = {M1, M2, . . . , Mm}, where m < n. Each job Jj can be processed on only one machine, with a processing time pj that is identical on all machines. Preemption is not allowed; therefore, once a machine starts processing a job, it must complete it without interruption. The objective is to minimize the completion time of the last processed job in the schedule (the makespan).
The identical parallel machine scheduling problem to minimize the makespan is described by Graham et al. [3] in terms of the three fields α|β|γ as Pm||Cmax, where Pm indicates the identical parallel machine environment with m machines, the empty second field indicates no preemption or precedence constraints on the jobs, and Cmax in the third field indicates the makespan minimization objective. This problem is proven to be NP-hard [4]. Therefore, efficient exact algorithms are needed to solve large instances to optimality in less computational time.
The remainder of this paper is organized as follows: Section 2 presents a literature review of the identical parallel machine scheduling problem and the arc flow model. Section 3 details the methodology used in the problem. The computational results and discussion are provided in Section 4. Finally, the conclusion and future work are shown in Section 5.

Literature Review
The first known heuristic applied to the identical parallel machine problem was the longest processing time (LPT) rule proposed by Ronald L. Graham [5]. LPT works by sorting jobs in non-increasing order of processing times and assigning them one by one to the least-loaded machine until all jobs are assigned. The worst-case approximation ratio for the LPT rule is 4/3 − 1/(3m), where m is the number of machines. Recently, Della Croce and Scatamacchia [6] revisited the LPT rule and improved the ratio to 4/3 − 1/(3m − 1). Other popular approximation heuristics combine Pm||Cmax with bin-packing techniques: MULTIFIT [7], COMBINE [8], and LISTFIT [9]. These heuristics provide a better worst-case performance, but at higher computation times.
Metaheuristics also play an important role in obtaining good solutions for parallel machine scheduling problems. Among these metaheuristics are the genetic algorithm [10], simulated annealing [11], the variable neighborhood search [12], and others in the literature.
Over the past years, researchers have increased their interest in finding efficient exact methods capable of solving large and hard instances to optimality in less computational time. Mokotoff [1] conducted a survey on parallel machine scheduling. Dell'Amico and Martello [13] proposed a branch-and-bound algorithm based on sophisticated lower and upper bounds and some dominance rules for Pm||Cmax. Mokotoff [14] proposed an exact method based on the cutting plane method for Pm||Cmax. Then, Dell'Amico and Martello [15] noted that their previous work [13], which had been published before the work of Mokotoff [14], obtained better results in terms of computation time and solved all the studied instances to optimality. Dell'Amico et al. [16] presented a scatter search algorithm and an exact algorithm based on branch-and-price, which make use of the duality between P||Cmax and the bin-packing problem. Haouari et al. [17] proposed lower bounds based on a lifting procedure. The authors also proposed two heuristics that require iteratively solving a subset sum problem. This work was extended by Haouari and Jemmali [18]: enhanced lower bounds were obtained, a new heuristic based on solving a sequence of 0-1 knapsack problems was proposed, and these bounds were embedded in a branch-and-bound algorithm. To test the performance of their algorithm, the authors identified a set of hard instances for which the ratio between the number of jobs and the number of machines was equal to 2.5. The proposed branch-and-bound solved only about 68% of a total of 240 generated instances with different numbers of jobs and machines.
The arc flow approach has been used recently in classical optimization problems, and allows modeling with a pseudo-polynomial number of variables and constraints. For a cutting-stock problem, de Carvalho [19] proposed a branch-and-price approach for an arc flow formulation. Next, it was extended to the bin-packing problem in de Carvalho [20]. An alternative arc flow formulation for the cutting-stock problem was proposed in [21,22], which used a graph compression technique that reduced the size of the constructed graph without affecting the optimal solution. These formulations were recently tested and compared in [23] against several other models and problem-specific algorithms on one-dimensional bin-packing and cutting-stock problems. J. Martinovic et al. [24] compared the arc flow model with a one-cut model for the one-dimensional cutting-stock problem and presented reduction techniques for both models. M. Mrad et al. [25] applied a graph compression method to an arc flow formulation for a two-dimensional strip-cutting problem. Other applications of arc flow include berth allocation problems [26], vehicle-routing problems [27], and facility location problems [28].
In the area of scheduling, Mrad and Souayah [29] proposed an arc flow formulation for the parallel machine scheduling problem to minimize the makespan. The proposed mathematical model outperformed other methods from the literature, since it solved to optimality most of the hard instances from the literature in a few seconds on average. On the other hand, some hard instances remained unsolved within a predefined time limit, mainly because the number of jobs was relatively large and the ratio of the number of jobs to the number of machines was greater than or equal to 2.25.
A. Kramer et al. [30] studied the identical parallel machine scheduling problem to minimize the total weighted completion time. An enhanced arc flow formulation with reduction techniques was proposed, which reduced the number of variables and constraints. As a consequence, large instances with up to 400 jobs were solved to optimality and instances with up to 1000 jobs provided a low optimal gap. Then, A. Kramer et al. [31] extended the previous work of A. Kramer et al. [30] by adding a release time constraint for each job. A mixed-integer linear program and a branch-and-price algorithm that relied on the decomposition of an arc-flow formulation and the use of exact and heuristic methods were proposed for solving pricing subproblems. S. Wang et al. [32] studied deterministic and parallel machine scheduling location problems and proposed a network flow-based formulation, two formulation-based heuristics, and one polynomial time algorithm. Trindade et al. [33,34] proposed an arc flow formulation for parallel and single batch processing machine scheduling with non-identical job sizes and a machine capacity. A. Kramer et al. [35] proposed five different formulations for identical parallel machine scheduling with family setup times to minimize the total weighted completion time. The formulations were one commodity formulation, three arc flow formulations, and a set covering formulation. The results showed that one of the arc flow formulations and a set covering formulation yielded a better performance. de Lima et al. [36] conducted a survey on the foundation of the arc flow formulation and showed the relation between their network and dynamic programming. The survey also discussed the main solution methods for solving large-scale arc flow models and their main applications. de Lima et al. [37] proposed a network flow framework to address huge network issues in arc flow model solutions.
To summarize, we can say that, so far, there are two kinds of proposed approaches for makespan minimization on parallel machines: heuristics and exact methods. On the one hand, heuristics have the potential to find good solutions in a reasonable amount of time. However, they do not guarantee the optimality of the provided solutions. For instance, one of the most recent heuristics for the studied problem [6] can still be outperformed by older methods on some benchmark instances. On the other hand, exact algorithms deliver optimal schedules, but are limited by the size of the instances (number of jobs/machines) and the relatively high computation time. To illustrate, the state-of-the-art exact method [29] still fails to solve some instances with as few as 154 jobs. Moreover, it requires thousands of seconds to solve problems with fewer than 200 jobs.
The main aim of this paper was to move towards an efficient implementation of the arc flow model for makespan minimization in the identical parallel machine scheduling problem. Hence, the arc flow model proposed by Mrad and Souayah [29] was considered. The improvements were made by proposing enhanced bounds and graph compression techniques to reduce the number of variables and constraints in the constructed arc flow model. A variable neighborhood search (VNS) algorithm with five neighborhood structures, starting from the schedule of the longest processing time (LPT) rule, was proposed to obtain an upper bound. Three lower bounds from the literature were considered for the improved arc flow model. A better upper bound acts as a graph compression technique in itself, since it reduces the size of the graph and, as a consequence, the number of variables. Furthermore, another proposed graph compression technique eliminates schedules that place certain jobs on the same machine whenever doing so would make the resulting lower bound of the problem exceed the current upper bound obtained by the VNS algorithm. It is worth noting that devising an improved arc flow formulation has an important application in reducing the computation time for solving hard optimization problems. This is attested by our experimental results, as well as by previous successful implementations of such techniques [30,31].

Methodology
In this section, the methodology of applying the improved arc flow model with enhanced bounds to the identical parallel machine scheduling problem to minimize the makespan is explained. The first step was to calculate the bounds of the problem. Three lower bounds from the literature were applied, and the maximum among them was selected as the lower bound of the problem. The second step was to find an upper bound using the LPT rule. In Step 3, the upper bound obtained from the LPT rule became the input for the proposed VNS algorithm to further improve, if possible, the upper bound. The lower and upper bounds were then used to construct the graph of the improved arc flow model. To build the graph, a graph compression technique was proposed to determine the set of jobs that would have an arc leaving node 0; then, the whole graph was constructed. After that, a mathematical formulation was proposed for the improved arc flow model. Finally, the mathematical model was solved with Cplex 12.10. If the model was solved within the 20 min time limit, the obtained solution was optimal; otherwise, the instance was not solved to optimality within the specified time limit. The flowchart of the methodology is shown in Figure 1. Each step is detailed in the following subsections.

The methodology can be summarized in the following steps:
Step 1. Calculate a lower bound for the problem.
Step 2. Obtain an initial solution using the LPT rule.
Step 3. Apply the VNS algorithm (Algorithm 1) to the solution obtained in Step 2 to obtain a better upper bound.
Step 4. Build the improved arc flow graph by applying Algorithms 2 and 3.
Step 5. Formulate a mathematical model for the constructed graph in Step 4.
Step 6. Solve the mathematical model with Cplex 12.10.

Figure 1. The flowchart of the methodology: the model is solved with a 1200 s time limit; if a solution is obtained within this limit, an optimal solution is found, otherwise the instance is not solved to optimality.
Lower Bounds
The lower bounds used in this paper are presented as follows, assuming that p1 ≤ p2 ≤ . . . ≤ pn. These lower bounds were proposed by Dell'Amico and Martello [13]:
LB1 = ⌈(∑j pj)/m⌉ (1)
LB2 = pn (2)
LB3 = p(n−m) + p(n−m+1) (3)
The best lower bound is the maximum value among them. Therefore,
LB = max{LB1, LB2, LB3} (4)
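As an illustration, the best of the three bounds can be computed in a few lines. The following is a minimal sketch with our own names; the job list is a hypothetical instance, not one of the benchmark sets.

```python
import math

def lower_bound(p, m):
    """Best of the three classical lower bounds for Pm||Cmax [13].

    p: list of processing times; m: number of machines.
    """
    p = sorted(p)                       # non-decreasing order, as assumed above
    n = len(p)
    lb1 = math.ceil(sum(p) / m)         # average machine load, rounded up
    lb2 = p[-1]                         # the longest job must fit on one machine
    # two of the m+1 longest jobs must share a machine (only meaningful if n > m)
    lb3 = p[n - m - 1] + p[n - m] if n > m else 0
    return max(lb1, lb2, lb3)

# Hypothetical instance: 6 jobs on 2 machines
print(lower_bound([1, 2, 3, 5, 8, 10], 2))  # -> 15
```

For this instance, LB1 = ⌈29/2⌉ = 15, LB2 = 10, and LB3 = 5 + 8 = 13, so LB = 15.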

Upper Bound
In this paper, a variable neighborhood search (VNS) algorithm was proposed to compute the upper bound for the arc flow graph. The VNS is detailed in the following subsections.

Variable Neighborhood Search
The variable neighborhood search (VNS) is a metaheuristic proposed by Mladenović and Hansen [38] that systematically explores neighborhood structures to improve the quality of an existing solution. To start the VNS algorithm, an initialization phase is required: an initial solution "S" is obtained (randomly or by using a heuristic method); neighborhood structures "Nk(S)", k = 1, 2, ..., kmax, are designed, where k is the index of a neighborhood and kmax is the maximum number of neighborhoods; and the stopping criteria are determined.

The Initial Solution
In this work, we proposed the longest processing time (LPT) rule to generate the initial solution for the VNS algorithm. The LPT rule works by first sorting the jobs in non-increasing order of their processing times and then assigning the jobs, one by one, to the least-loaded machine until all jobs have been assigned.
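The LPT rule can be sketched with a priority queue over machine loads. This is a minimal illustration with our own names, not the authors' implementation; the instance in the usage line is hypothetical.

```python
import heapq

def lpt(p, m):
    """Longest processing time rule: assign each job, longest first,
    to the currently least-loaded machine. Returns the makespan."""
    # heap entries: (load, machine id); the id breaks ties deterministically
    machines = [(0, i) for i in range(m)]
    heapq.heapify(machines)
    for j in sorted(range(len(p)), key=lambda j: -p[j]):  # non-increasing p
        load, i = heapq.heappop(machines)                 # least-loaded machine
        heapq.heappush(machines, (load + p[j], i))
    return max(load for load, _ in machines)

# Hypothetical instance: 6 jobs on 2 machines
print(lpt([1, 2, 3, 5, 8, 10], 2))  # -> 15
```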

The Neighborhood Structures
The neighborhood structures are methods designed to enhance the local search in the VNS algorithm. To design the neighborhood structures for an identical parallel machine scheduling problem with a makespan objective, two terms are used:

• Problem machine (Pm): a machine whose total scheduled time equals the makespan.
• Non-problem machine (NPm): a machine whose total scheduled time is less than the makespan.
The five neighborhood structures used in this work are explained as follows:
1. Move (one): move a job i from a problem machine (Pm) to a non-problem machine (NPm) if (makespan of Pm (CPm) − makespan of NPm (CNPm) > processing time of job i (pi));
2. Exchange (one-one): exchange a job i from a Pm with a job j from an NPm if (pi − pj > 0) and (CPm − CNPm > pi − pj);
3. Exchange (two-one): exchange two jobs, i and j, from a Pm with a job k from an NPm if (pi + pj − pk > 0) and (CPm − CNPm > pi + pj − pk);
4. Exchange (one-two): exchange a job i from a Pm with two jobs, j and k, from an NPm if (pi − (pj + pk) > 0) and (CPm − CNPm > pi − (pj + pk));
5. Exchange (two-two): exchange two jobs, i and j, from a Pm with two jobs, k and l, from an NPm if (pi + pj − (pk + pl) > 0) and (CPm − CNPm > pi + pj − (pk + pl)).
To illustrate how the neighborhood structures work, consider six jobs to be scheduled on two identical parallel machines. The processing times in minutes of each job are shown in Table 1. Let us consider the initial solution with Cmax = 19 obtained by the schedule shown in Figure 2. From Equation (4), LB = max(⌈29/2⌉, 10, 5 + 8) = 15. The problem machine (Pm) is Machine 1, while Machine 2 is the non-problem machine (NPm). It is worth noting that the jobs for each neighborhood structure were selected arbitrarily in this example, just to illustrate how these structures work.
Since the condition (makespan of Pm (Machine 1) − makespan of NPm (Machine 2) > p6) is satisfied, applying the "Move (one)" neighborhood structure to move Job 6 from Pm (Machine 1) to NPm (Machine 2) yields Cmax = 18, as illustrated in Figure 3.
Now, the conditions (p2 − p5 > 0) and (CPm − CNPm > p2 − p5) are satisfied. Applying the "Exchange (one-one)" neighborhood structure results in Cmax = 17, with the schedule illustrated in Figure 4. In this schedule, Machine 2 becomes the Pm and Machine 1 the NPm.
With the conditions (p4 + p2 − p1 > 0) and (CPm − CNPm > p4 + p2 − p1) satisfied, the "Exchange (two-one)" neighborhood structure is applied by exchanging Jobs 4 and 2 from Pm (Machine 2) with Job 1 from NPm (Machine 1). The resulting schedule, with Cmax = 16, is shown in Figure 5.
Note that the "Exchange (one-two)" neighborhood structure is similar to the previous one, with the difference that one job from the Pm is exchanged with two jobs from the NPm. Here, its conditions are not satisfied for the schedule in Figure 5.
Finally, the fifth neighborhood structure, "Exchange (two-two)", is applied by exchanging Jobs 1 and 6 from Pm (Machine 2) with Jobs 2 and 5 from NPm (Machine 1). The obtained schedule, with Cmax = 15, is shown in Figure 6. Note that this is the optimal makespan, since it equals the LB. Figure 6. The optimal schedule after applying the "Exchange (two-two)" neighborhood structure.
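The acceptance test behind each structure follows the same pattern: the change must strictly reduce the problem machine's load without pushing the non-problem machine past it. The sketch below implements the first two structures with our own names; the instance in the test data is hypothetical, not Table 1.

```python
def try_move_one(p, pm, npm):
    """Move (one): move a job i from the problem machine pm to a
    non-problem machine npm when C_Pm - C_NPm > p_i.
    pm, npm are lists of job indices; returns the moved job or None."""
    c_pm = sum(p[j] for j in pm)
    c_npm = sum(p[j] for j in npm)
    for i in pm:
        if c_pm - c_npm > p[i]:
            pm.remove(i)
            npm.append(i)
            return i
    return None

def try_exchange_one_one(p, pm, npm):
    """Exchange (one-one): swap job i in pm with job j in npm when
    p_i - p_j > 0 and C_Pm - C_NPm > p_i - p_j."""
    c_pm = sum(p[j] for j in pm)
    c_npm = sum(p[j] for j in npm)
    for i in pm:
        for j in npm:
            if p[i] - p[j] > 0 and c_pm - c_npm > p[i] - p[j]:
                pm.remove(i)
                npm.remove(j)
                pm.append(j)
                npm.append(i)
                return i, j
    return None
```

The remaining three structures only change which subsets of jobs are compared; the strict inequalities are what guarantee that each accepted move lowers the makespan of the problem machine.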


Steps of Variable Neighborhood Search Algorithm
The VNS algorithm is detailed by Algorithm 1.

Algorithm 1: VNS algorithm.
Obtain an initial solution. It is worth noting that the maximum number of iterations in the VNS algorithm matters only when the initial solution is generated randomly: at each iteration, a random schedule is generated and the neighborhood structures are applied, and the final solution is the best makespan over all iterations. When the initial solution is obtained with a heuristic such as the LPT rule, increasing the number of iterations has no effect; therefore, only one iteration is needed in the proposed VNS algorithm.
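The listing of Algorithm 1 does not survive in this copy; the loop below is a plausible reconstruction of the VNS as described (one pass from a heuristic start, cycling through the neighborhoods and restarting from the first one whenever an improvement is found). All names are ours, and the neighborhood structures are passed in as callables.

```python
def makespan(schedule, p):
    """Makespan of a schedule given as a list of job-index lists."""
    return max(sum(p[j] for j in machine) for machine in schedule)

def vns(p, m, neighborhoods, init_schedule):
    """VNS sketch: start from init_schedule(p, m); try N_1..N_kmax in
    order; on improvement accept and restart from N_1; stop when no
    structure improves the makespan."""
    schedule = init_schedule(p, m)          # e.g. the LPT schedule
    best = makespan(schedule, p)
    k = 0
    while k < len(neighborhoods):
        candidate = neighborhoods[k](schedule, p)   # apply N_k
        cand_val = makespan(candidate, p)
        if cand_val < best:                 # improvement: accept, restart at N_1
            schedule, best, k = candidate, cand_val, 0
        else:
            k += 1                          # otherwise try the next structure
    return schedule, best
```

With the LPT schedule as the start, a single such pass is the "one iteration" referred to above.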

The Improved Arc Flow Model
In this section, we will show how to construct a graph for the proposed improved arc flow model, give a numerical example to present the impact of the proposed model on reducing the number of variables compared to the arc flow model proposed by Mrad and Souayah [29], and formulate the studied problem based on the improved arc flow model.

Graph Building
The arc flow model for the Pm||Cmax problem is constructed by finding m disjoint paths (one per machine) on a graph whose nodes represent the time instants from the starting time at node 0 up to the ending time at node T, where 0 ≤ T ≤ UB. Consider a graph G = (V, A), where V = {0, . . . , UB}, with v denoting a general vertex index, and where A is the union of the job arc sets An, n ∈ {1, . . . , N}, and the loss arc set Ao (A = An ∪ Ao). A loss arc connects each vertex (except vertices 0 and UB) to vertex UB. Each job arc is defined by three terms: an initial node s, a final node d, and the job n represented by the arc, where s and d are, respectively, the initial and final nodes of the job arc.
The set of arcs for each job n is constructed as follows: a directed arc is created between two vertices s and d if there is a job n with processing time pn = d − s, given that d ≤ UB and s < d. This set, i.e., An, can be reduced by applying the following symmetry-breaking rules [29]:
1. Jobs are sorted in decreasing order of their processing times.
2. An arc of a job n cannot be created from a node that was added by an arc of the same job.
3. Arcs for each job only start from the nodes created by previously considered jobs, assuming that node 0 is created at the beginning of the graph-building process.
In addition to the above symmetry-breaking rules, a graph compression technique was proposed that prevents the arcs of some jobs from being created from node 0. The idea behind this technique is based on the fact that, by knowing the makespan of some jobs scheduled on one machine, we can compute a lower bound for the remaining jobs on the remaining (m − 1) machines, where m is the total number of machines. If the resulting lower bound exceeds the upper bound, these jobs will not have arcs from node 0; instead, their arcs are created from the first node created after node 0. Another graph compression technique was used to reduce the number of loss arcs. A loss arc is only needed for the last job in each machine's schedule so that flow conservation is satisfied. Therefore, a loss arc is only created from a node v that satisfies (∑n∈N pn − v)/(m − 1) ≤ UB. The graph compression algorithm for the job arcs is given in Algorithm 2. The jobs selected by Algorithm 2 were used to build the graph from node 0, while the remaining jobs were used to build the graph from the next node after node 0, as illustrated in Algorithm 3.
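The node-creation discipline of rules 1–3 can be sketched as follows. This is our reading of the basic graph-building step (without the node-0 compression of Algorithms 2 and 3), with hypothetical names.

```python
def build_job_arcs(p, ub):
    """Build job arcs under the symmetry-breaking rules [29]:
    jobs considered in non-increasing order of p (rule 1); arcs of a
    job start only at nodes created by previously considered jobs
    (rule 3), never at nodes the same job just created (rule 2)."""
    order = sorted(range(len(p)), key=lambda j: -p[j])   # rule 1
    nodes = {0}                       # node 0 exists initially
    arcs = []                         # arcs as (s, d, job) triples
    for j in order:
        starts = sorted(nodes)        # rules 2-3: nodes from earlier jobs only
        new_nodes = set()
        for s in starts:
            d = s + p[j]              # arc (s, d, j) with d - s = p_j
            if d <= ub:
                arcs.append((s, d, j))
                new_nodes.add(d)
        nodes |= new_nodes            # job j's nodes become usable later
    return nodes, arcs
```

Deferring `new_nodes` until after the inner loop is exactly what enforces rule 2: a job never chains onto a node it created itself.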

Numerical Example
Consider five jobs to be scheduled on two identical parallel machines. The processing times in minutes of each job are shown in Table 2. Let LB = 14 min using Equation (4) and UB = 15 min, obtained by both the LPT and VNS algorithms. The arc flow graph built using the algorithm proposed by [29] gives 11 nodes, 15 job arcs, and 9 loss arcs. Algorithm 2 can then be applied to check whether any job arcs can be reduced, starting from the job with the least processing time, i.e., Job 5, by checking the new lower bound when one machine processes only Job 5. The graph compression for loss arcs, when applied, yields only one loss arc. The arc flow network, as proposed in [29], is shown in Figure 7. The resulting improved arc flow network, with seven nodes, nine job arcs, and one loss arc, is shown in Figure 8. The optimal solution is shown in Figure 9.

Mathematical Formulation
The objective in Equation (5) minimizes the makespan. Constraint (6) sets the makespan to the maximum final node among the selected job arcs, given that it is greater than or equal to the lower bound. Constraint (7) states that the total number of arcs leaving node 0 must equal the number of machines, since each connected path in the final solution represents the schedule of one machine. The flow conservation constraints in Equation (8) ensure that, for intermediate nodes, any arc entering a node is matched by an arc leaving it. Constraint (9) ensures one arc for each job. Constraints (10) and (11) define the binary and integer decision variables, respectively: binary variables for the job arcs and integer variables, bounded by m, for the loss arcs. Finally, Constraint (12) gives the bounds on the makespan.
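The verbal description of Constraints (5)–(12) can be sketched, in our own notation, as the following formulation; this is a reconstruction consistent with that description, not necessarily the exact model of [29]. Here x_{sd}^n are the job-arc variables, y_v the loss-arc variables, and A_J the set of job arcs.

```latex
\begin{align}
\min\ & C_{\max} && \text{(5)}\\
\text{s.t.}\ & C_{\max} \ge d\, x_{sd}^{n} && \forall (s,d,n) \in A_J \tag*{(6)}\\
& \sum_{(0,d,n) \in A_J} x_{0d}^{n} = m && \tag*{(7)}\\
& \sum_{(s,v,n) \in A_J} x_{sv}^{n} = \sum_{(v,d,n) \in A_J} x_{vd}^{n} + y_{v}
  && \forall v \in V \setminus \{0, UB\} \tag*{(8)}\\
& \sum_{(s,d,n) \in A_J} x_{sd}^{n} = 1 && \forall n \in \{1,\dots,N\} \tag*{(9)}\\
& x_{sd}^{n} \in \{0,1\} && \forall (s,d,n) \in A_J \tag*{(10)}\\
& y_{v} \in \{0,1,\dots,m\} && \forall v \in V \setminus \{0, UB\} \tag*{(11)}\\
& LB \le C_{\max} \le UB && \tag*{(12)}
\end{align}
```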

Computational Results and Discussion
The proposed improved arc flow model was coded in C++ and the mathematical model was solved using Cplex 12.10. The proposed model was run on an Intel Core i7-4930K CPU @ 3.40 GHz with 34.0 GB of RAM. The improved arc flow model was compared with the arc flow model proposed by Mrad and Souayah [29]. For a fair comparison, Mrad and Souayah's algorithm [29] was run on the same PC. Both algorithms used the same LPT rule; however, the algorithm of Mrad and Souayah [29] used LPT directly as the UB for their arc flow model, whereas in the proposed improved arc flow model, the LPT schedule served as the initial solution for the VNS algorithm. Both algorithms had a time limit of 1200 s per instance. We first describe the benchmark instances used in this paper and then present the computational results.

Benchmark Instances
In this paper, we considered the benchmark instances proposed by Mrad and Souayah [29] for the P||Cmax problem. The benchmark instances were set by first considering the type of instances based on the ratio n/m, which takes the values n/m ∈ {2, 2.25, 2.5, 2.75, 3}. Then, for each ratio, the processing times of the jobs were generated based on seven classes. Class 1, Class 2, and Class 3 were generated using a discrete uniform distribution on the intervals [1, 100], [20, 100], and [50, 100], respectively. Classes 4 and 5 were generated using a normal distribution with mean µ = 100 and standard deviations σ = 20 for Class 4 and σ = 50 for Class 5. Class 6 was generated using a discrete uniform distribution on the interval [n, 4n], and Class 7 was generated using a normal distribution with µ = 4n and σ = n. For each ratio, each class had 10 combinations of n jobs and m machines, and each combination had 10 instances. Therefore, the total number of generated instances was 3500. The combinations of n and m are shown along with the results in Tables 3 and 4.
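The class definitions can be read as the following generators. This is our sketch, assuming the Class 1–3 intervals are [1, 100], [20, 100], and [50, 100], that Classes 4, 5, and 7 are normally distributed, and that draws are rounded and clipped to be at least 1 (the clipping is our choice, not stated in the source).

```python
import random

def gen_processing_times(cls, n):
    """Generate n processing times for benchmark class cls in 1..7,
    following our reading of the class definitions of [29]."""
    if cls == 1:
        return [random.randint(1, 100) for _ in range(n)]
    if cls == 2:
        return [random.randint(20, 100) for _ in range(n)]
    if cls == 3:
        return [random.randint(50, 100) for _ in range(n)]
    if cls == 4:  # normal, mu=100, sigma=20 (rounded, clipped at 1)
        return [max(1, round(random.gauss(100, 20))) for _ in range(n)]
    if cls == 5:  # normal, mu=100, sigma=50
        return [max(1, round(random.gauss(100, 50))) for _ in range(n)]
    if cls == 6:  # uniform on [n, 4n]
        return [random.randint(n, 4 * n) for _ in range(n)]
    if cls == 7:  # normal, mu=4n, sigma=n
        return [max(1, round(random.gauss(4 * n, n))) for _ in range(n)]
    raise ValueError("class must be in 1..7")
```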

Comparison of the Arc Flow Model and the Improved Arc Flow Model
In this section, a comparison between the arc flow model developed by Mrad and Souayah [29] and the proposed improved arc flow model was conducted. The results of the mean CPU times in seconds for both models are shown in two tables. Table 3 presents a comparison of the two models for Classes 1-4, whereas Table 4 presents a comparison for Classes 5-7.
The results in Table 3 show that, for n/m = 2, the arc flow (AF) model obtained the optimal solution in less than one second for almost all of the instances (except the last two (n, m) combinations of Class 1). On the other hand, the improved arc flow (IAF) model optimally solved all these instances in less than 0.40 s. These results suggest that the instances with a ratio of n/m = 2 are the easiest ones. Indeed, for n/m = 2.25, the mean CPU time of the arc flow model gradually increased for some classes and some combinations of n and m; for instance, it reached 20.1808 s for Class 4 with n = 198 and m = 88. In contrast, the improved arc flow model still found the optimal solution for all classes in less than one second. Instances with ratios of 2.5 and 2.75 appear to be the hardest, as the maximum recorded CPU times were observed there for both the arc flow model (40.1836 s) and the improved arc flow model (3.00184 s). Overall, the results in Table 3 show that, although both models obtained the optimal solution within the time limit, the improved arc flow model clearly outperformed the arc flow model in terms of the required CPU time.
From Table 4, we observed that Class 5 can be considered the easiest class compared with Classes 6 and 7. Indeed, all instances of this class were solved to optimality in a relatively short time by both models: the maximum CPU time was 14.5461 s for the arc flow model and 6.14091 s for the improved arc flow model. On Classes 6 and 7, the proposed approach dramatically outperformed the arc flow model. Indeed, the improved arc flow (IAF) model was, on average, 5.68 and 12.70 times faster than the arc flow (AF) model on Classes 6 and 7, respectively, and this ratio reached as much as 60.53 for Class 7 with n = 126 and m = 56. Moreover, we observed that, for two combinations of n and m (n = 198, m = 72 and n = 220, m = 80) in Class 7, the arc flow model (AF) reached the maximum time limit of 1200 s without finding the optimal solution for any of the 20 considered instances. On the other hand, the proposed improved arc flow model (IAF) solved 19 of these instances to optimality, with average CPU times of 252.22 s and 377.1867 s, respectively.
To provide overall insight into the performance of the proposed improved arc flow model, Table 5 shows the total time of the arc flow model [29], the total time of the proposed improved arc flow model, and the percentage of improvement (% improvement) in the total mean CPU time for each ratio (n/m). The % improvement measures the time savings obtained by the proposed improved arc flow model compared with the arc flow model [29] and was calculated as follows: % improvement = ((Total time of AF − Total time of IAF)/Total time of AF) × 100. A positive % improvement indicates that the proposed improved arc flow model obtained a lower total CPU time than the arc flow model [29].
The results in Table 5 show that the IAF model considerably reduced the total time for all ratios. The percentage of time reduction ranged from 67.16% up to 93.09%, with the overall percentage of time reduction reaching 87.44%. Figure 10 provides a visual representation of the total CPU times of both models.
To show how many of the 10 instances per combination were not solved to optimality by each model, Table 6 reports the number of unsolved instances for the (n, m) combinations in Classes 6 and 7, i.e., instances for which the solver could not reach the optimal solution within the time limit (1200 s). The other (n, m) combinations of Classes 6 and 7 are not shown in Table 6 since both models found an optimal solution for all of their instances.
The results show that Class 6 and Class 7 with the ratios n/m = 2.5, 2.75, and 3 were the hardest instances for the AF model [29]. The total numbers of unsolved instances over all ratios were 19 for Class 6 and 58 for Class 7. Therefore, 77 of the 3500 instances were not solved to optimality by the AF model [29], corresponding to 97.80% solved instances. On the other hand, only one instance, in Class 7 with ratio n/m = 2.75 and combination (n = 220, m = 80), was not solved to optimality by the proposed improved arc flow model. The percentage of solved instances was thus 99.97%, an increase of 2.17% in solved instances compared with the AF model [29].