1. Introduction
The one-machine scheduling problem that we consider in this paper can be formulated as follows: $n$ jobs have to be scheduled on a single machine. Job $j$ becomes available at its release time $r_j$, can be scheduled on the machine at time $r_j$ or later, and needs to be processed continuously during $p_j$ time units by the machine. The machine can handle at most one job at a time. If job $j$ starts at time $t_j$ on the machine, then its completion time is $c_j = t_j + p_j$. The due-date $d_j$ of job $j$ is the time moment at which the completion of that job is desired. A feasible schedule $S$ is a mapping that assigns to each job $j$ a starting time $t_j(S)$ on the machine, such that $t_j(S) \ge r_j$ and $t_j(S) \ge t_i(S) + p_i$, for any job $i$ assigned to the machine before job $j$ in schedule $S$. The first inequality represents the restriction that a job cannot be started before its release time, and the second one represents the resource restriction that the machine can handle only one job at any time moment. The lateness of job $j$ in schedule $S$ is $L_j(S) = c_j(S) - d_j$. The objective is to find a feasible schedule that minimizes the maximum job lateness $L_{\max}(S) = \max_j L_j(S)$.
The problem has an equivalent setting in which job due-dates are replaced by job delivery times. In this setting, every job $j$ completed on the machine requires an additional (constant) delivery time $q_j$ for its full completion (the finished orders need to be delivered by an independent agent, which needs no machine time). Thus, the full completion time of job $j$ is $C_j(S) = c_j(S) + q_j$. A feasible schedule is defined as for the first setting. The objective is to find an optimal schedule, one minimizing the maximum full job completion time (the so-called makespan).
Note that the smaller the due-date of a job, the more urgent the job is; equivalently, the larger the delivery time of a job, the more urgent it is. The equivalence between the two settings is established by interchanging the roles of job delivery times and job due-dates. Given the setting with delivery times, take a suitably large constant $K$ (any magnitude no less than the maximum job delivery time) and, for every job $j$, define the due-date of that job as $d_j = K - q_j$, and vice-versa (Bratley et al. [1]).
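The transformation between the two settings can be sketched in a few lines; the helper names and the sample data below are ours, not from [1]:

```python
# Convert between the delivery-time and due-date settings.
# K must be no less than the maximum delivery time (resp. due-date).

def delivery_to_due_date(q, K):
    """Given delivery times q_j, define due-dates d_j = K - q_j."""
    assert K >= max(q)
    return [K - qj for qj in q]

def due_date_to_delivery(d, K):
    """Given due-dates d_j, define delivery times q_j = K - d_j."""
    assert K >= max(d)
    return [K - dj for dj in d]

# A job with a larger delivery time maps to a smaller (more urgent) due-date.
q = [5, 0, 3]
d = delivery_to_due_date(q, K=5)          # -> [0, 5, 2]
assert due_date_to_delivery(d, K=5) == q  # the transformation is an involution
```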
Since both settings describe the same problem, we will refer to them interchangeably. According to Graham's three-field notation, the settings with job due-dates and delivery times are abbreviated as $1|r_j|L_{\max}$ and $1|r_j, q_j|C_{\max}$, respectively (the three fields specify the single-machine environment, the job parameters, and the objective criterion, respectively). The problem is well known to be strongly NP-hard (Garey and Johnson [2]), and it remains (weakly) NP-hard with only two allowable job release times [3].
Overview of some related work. We first give a short overview of work directly related to our scheduling problem, and then we mention recent work on some related single-machine scheduling problems.
The first exponential-time implicit enumeration (branch-and-bound) algorithms for problem $1|r_j|L_{\max}$ were proposed in the late 1970s, and a few more implicit enumeration algorithms were suggested in the 1980s (see, for example, McMahon and Florian [4], Carlier [5] and Grabowski et al. [6]).
As to basic polynomially solvable special cases, remarkably, if all jobs are released simultaneously or all jobs have the same delivery time (or the same due-date), the greedy algorithm proposed by Jackson in the late 1950s [7] finds an optimal solution. Jackson's greedy algorithm was adapted for the general case by Schrage [8]. The extended Jackson's algorithm iteratively determines the current scheduling time, which is either a job release time or a job completion time, and among the jobs released by this time it schedules a job with the smallest due-date (or the largest delivery time). For further reference, we abbreviate the heuristic as the ED (Earliest Due-date) and LDT (Largest Delivery Time) heuristic, respectively.
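The extended Jackson heuristic just described can be sketched as follows, in its LDT (delivery-time) form; jobs are given as parallel lists of release, processing and delivery times, and the function name and data layout are our own:

```python
import heapq

def ldt_schedule(release, proc, delivery):
    """Extended Jackson (LDT) heuristic: at each scheduling time, among the
    already released jobs, schedule one with the largest delivery time.
    Returns the processing order, the start times (indexed by job) and the
    makespan (maximum full completion time)."""
    n = len(release)
    seq = sorted(range(n), key=lambda j: release[j])   # jobs by release time
    ready = []                  # max-heap on delivery time (negated)
    t, i = 0, 0                 # current scheduling time, next unreleased job
    order, start = [], [0] * n
    while len(order) < n:
        if not ready:           # machine idle: jump to the next release time
            t = max(t, release[seq[i]])
        while i < n and release[seq[i]] <= t:
            heapq.heappush(ready, (-delivery[seq[i]], seq[i]))
            i += 1
        _, j = heapq.heappop(ready)        # a most urgent released job
        start[j] = t
        t += proc[j]
        order.append(j)
    makespan = max(start[j] + proc[j] + delivery[j] for j in range(n))
    return order, start, makespan
```

For the ED (due-date) variant one would pop a job with the smallest due-date instead; by the transformation $d_j = K - q_j$ the two variants produce the same schedule.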
For the preemptive version of our problem, $1|r_j, pmtn|L_{\max}$ (any running job can be interrupted in favor of another job and its execution can later be resumed), Jackson's extended heuristic is optimal. Hence, the non-preemptive version of this problem is essentially more complex than the preemptive one. With non-simultaneously released jobs, if the processing times of all jobs are the same, the (non-preemptive) problem is essentially harder to solve than the version without job release times, but it still admits a polynomial-time solution (Garey et al. [9]), and it remains polynomially solvable with only two allowable job processing times [10]. If the set of job processing times consists of mutually divisible numbers (e.g., powers of 2), then the problem can still be solved in polynomial time [11] (apparently, this is the most general setting with restricted job processing times that can be solved in polynomial time, since a setting in which the job processing times are drawn from a certain set parameterized by an integer $p$ is strongly NP-hard [12]).
In the general setting, since our scheduling problem $1|r_j|L_{\max}$ is NP-hard, a reasonable alternative to an implicit enumeration of the feasible solutions is an approximation solution method. A $\rho$-approximation algorithm obtains a solution with the objective value at most $\rho$ times the optimal objective value, for any instance of a given problem ($\rho$ is commonly referred to as the approximation ratio). Jackson's extended heuristic gives a 2-approximation solution for the general setting $1|r_j, q_j|C_{\max}$. Approximation algorithms with approximation ratios $3/2$ and $4/3$, respectively, were suggested by Potts [13] and Hall and Shmoys [14] (both are based on Jackson's greedy algorithm). To the best of our knowledge, there is no polynomial-time algorithm with an approximation ratio better than $4/3$.
We refer the reader to Lawler et al. [15] for an extensive overview of machine scheduling problems. Below we address a few recent contributions on related one-machine scheduling problems in which job parameters and/or objective functions differ from those in our problem.
As to different objectives, in some applications it is desirable to complete the jobs "just-in-time", i.e., it is undesirable to complete them not only far behind but also far ahead of their due-dates (for instance, because of warehousing and transportation issues). A job is said to be early if it is completed before its due-date, and tardy if it is completed after its due-date. Clearly, the early completion of all jobs favors the minimization of the maximum lateness (note that in a feasible schedule the lateness of some or even of all jobs can be negative). The tardiness of a job is defined similarly to its lateness, with the difference that it cannot be negative, i.e., if the lateness of job $j$ is negative, then its tardiness is set to 0. Different objective functions that favor neither early nor tardy jobs are possible. One such function is the sum of the maximum job earliness and the maximum job tardiness. With non-equal job release times and due-dates, the setting with this objective function is known to be strongly NP-hard. Mahnam et al. [16] proposed a branch-and-bound method for the problem. Yazdani et al. [17] consider a model with this objective function without job release times (i.e., all jobs are simultaneously released), but with the additional restriction that there are multiple machine unavailability periods, time intervals in which no job can be scheduled on the machine (such a restriction might be motivated by machine maintenance, repair, etc.). The authors present a variable neighborhood search meta-heuristic for this model (which is also known to be NP-hard).
In some industries the production process is not homogeneous with respect to the set of jobs, i.e., different jobs have different priorities. For such environments, a (positive) weight is introduced for each job (jobs with a larger weight have higher priority). An overview of weighted single-machine scheduling problems with the objective to minimize the weighted number of tardy jobs can be found in Adamu and Adewumi [18].
Scheduling models with learning and deterioration effects have also been considered recently. Roughly, under the deterioration effect, the processing time of a job depends on its starting time: the later it starts, the more processor time it needs. This might be caused by the fact that the machine deteriorates during a long load period, or because the raw materials that need to be processed deteriorate over time. Under the learning effect, the job processing time also depends on the position of the job among the other jobs: the later this position, the smaller the amount of time the job needs to complete, since the workers and the machines improve the production efficiency as they accumulate processing experience. These effects are reflected in job processing times with the help of auxiliary parameters. In the papers listed below, all jobs are simultaneously released.
Yin and Xu [19] consider one-machine scheduling problems with learning and deterioration effects with the objective to minimize the makespan, the sum of the $i$th powers of the job completion times, the total lateness, and the sum of earliness penalties. They show that the proposed models are polynomially solvable. The authors also give conditions under which the settings with the objective to minimize the total weighted completion time, maximum lateness, maximum tardiness, total tardiness and total weighted earliness penalties are polynomially solvable. Lu et al. [20] study other related one-machine scheduling problems with learning effects. They consider settings with different objective functions, including again the makespan and the total completion time of all jobs, and propose polynomial-time solution methods. Hou et al. [21] propose another one-machine scheduling model with a single machine maintenance period, after which job processing times change according to a given deterioration formula (the machine deteriorates over time and can be fully or partially restored after the maintenance period). Recently, Cheng et al. [22] considered other single-machine scheduling models with multiple machine maintenance periods. The jobs between the maintenance periods are partitioned into batches. Each maintenance period depends on the total processing time of the jobs from the batch processed before that period, and the job processing times deteriorate according to given formulas. Polynomial-time algorithms are proposed for the objectives to minimize the makespan and the total job completion time. Park and Choi [23] consider a more complex situation with a single in-house machine and outsourcing costs. If a job is not scheduled on the (in-house) machine, then it incurs an outsourcing cost. The authors consider an uncertainty setting in which the job processing times and the outsourcing costs depend on a given scenario. The objective is to minimize a specially defined weighted sum of the completion times of the jobs assigned to the in-house machine plus all the outsourcing costs. Since the considered scheduling problems are NP-hard even in the deterministic setting, the authors present optimality conditions yielding polynomial-time solutions.
Our contribution. Due to the complexity status of our scheduling problem, implicit enumeration algorithms cannot guarantee an optimal solution for large enough problem instances, as their worst-case running time is exponential in the length of the input. Although the algorithm that we propose here is based on the enumeration of the feasible solutions, it uses an approximation condition that provides a certain approximation factor for each created feasible solution in the search tree. Regarded as an exact solution method, our algorithm carries out implicit enumeration of complete feasible solutions in a search tree, in which each node represents a complete feasible solution. The reduction of the search space is accomplished on the basis of the established pruning and halting conditions, based on which unpromising feasible solutions are discarded from the search tree. At the same time, our algorithm can be regarded as an approximation algorithm, since it calculates an approximation factor for each enumerated feasible solution; hence, in practice, the algorithm can be stopped as soon as a desired approximation is guaranteed for the next created feasible solution. The approximation factor of each solution from a branch of the solution tree is calculated by constructing that solution according to specific rules and then analyzing structural properties of the earlier created solutions in that branch. In particular, the approximation factor for a created solution is guaranteed if jobs with a specific property are detected in the branch of the search tree to which that solution belongs (typically, the longer the path from the root to the node representing a solution in the search tree, the larger the value of the parameter $\kappa$ that determines the approximation factor). In this way, a complete enumeration can be avoided whenever a desired approximation factor is already attained by the next generated feasible solution.
Since the number of feasible solutions grows exponentially with the length of the input, the worst-case time and space complexities of our basic enumeration scheme are exponential. Although we cannot give a reasonable theoretical estimation for the parameter $\kappa$, in practice it becomes large enough within a short running time of our implicit enumeration algorithm.
To verify the practical performance of our algorithm and the effectiveness of the approximation factor calculated for each generated solution, we have carried out extensive computational experiments. As it turned out, the basic enumeration scheme gives an optimal solution for moderate-size problem instances. At the same time, since our main goal is to find an approximation solution in a short period of time, we were mainly interested in the guaranteed approximation factor of the best solution generated within a short interval of time. As the experimental results have shown, the approximation factor that our algorithm attains within a small execution time is, on average, better than the approximation factors provided by the earlier known approximation algorithms (these are traditional heuristic algorithms in the sense that they are not based on the enumeration of the feasible solution set).
We have randomly created and tested over a thousand problem instances, 50 problem instances for each tested number of jobs. We stress that a very considerable number of problem instances were solved optimally already by the standard earlier known dominance rules (in particular, ones used in the earlier mentioned enumeration algorithms). These instances (which formed over 80% of all the generated instances) are not present in the experimental data that we report here. The vast majority of the tested "difficult" instances with up to 50 jobs were solved optimally almost instantaneously, whereas the guaranteed average approximation factor was about 1.3 for instances from 100 to 600 jobs and about 1.2 for instances from 700 to 1000 jobs. About 10% of the larger problem instances with 2000 and 5000 jobs were solved optimally within one minute of running time of our algorithm, and the average approximation factor was about 1.3. Remarkably, the average approximation factor for the largest tested instances with 10,000 jobs decreased to about 1.25.
To test the behavior of our enumeration framework in the worst possible scenarios, we have also generated pseudo-random artificial problem instances. This second class of instances was created to be inconvenient for our algorithm, so that it would be forced to perform an almost complete enumeration of the candidate ED-schedules. Our aim was to verify the approximation that our algorithm would still guarantee in a reasonable time. As intended, our algorithm failed to create an optimal solution already for moderately sized artificial instances. However, extremely good approximation factors, very close to 1, were attained for these instances.
In the following Section 2, we give preliminaries, including some earlier known definitions and properties. In Section 3 we describe the basic framework of our implicit enumeration algorithm. In Section 4 we give the optimality, halting and approximability conditions that we incorporate into the basic framework. We analyze our experimental study in Section 5 and give our final remarks in Section 6.
2. Preliminaries
This section contains the earlier known preliminary notions which are used here (see, e.g., [10,24]). First, we give a more detailed description of the ED-heuristic (and the LDT-heuristic, respectively), the tool for the generation of our schedules. The algorithm works with $n$ scheduling times, at each of which a job is assigned to the machine. At the first iteration, the initial scheduling time is defined as the minimum job release time, and a most urgent job, one with the minimum due-date (the maximum delivery time, respectively), is scheduled on the machine (ties can be broken arbitrarily). Iteratively, the current scheduling time is determined as the maximum between the completion time of the latest scheduled job and the release time of an earliest released yet unscheduled job (recall that no job can be started before its release time and no job can be started while the machine remains busy). Again, among all jobs released by this scheduling time, a job with the minimum due-date (the maximum delivery time, respectively) is scheduled on the machine. The heuristic creates no avoidable gap: whenever the machine becomes idle and there is a released job, it schedules that job on the machine. At the same time, among the yet unscheduled jobs released by each scheduling time, it gives priority to a most urgent job (one with the minimum due-date or the maximum delivery time).
The two above heuristics give equivalent results for the two equivalent scheduling problems. We shall refer to a schedule created by the ED-heuristic (LDT-heuristic, respectively) as an ED (LDT, respectively) schedule. If we apply the heuristic to the originally given problem instance, we obtain the initial ED (LDT) schedule, which we denote by $\sigma$. As we will see a bit later, by slightly modifying the original problem instance, alternative ED (LDT) schedules with desired characteristics can be created.
We will refer to a maximal consecutive time interval in a schedule within which the machine is idle as a gap. For convenience, we will assume that a 0-length gap occurs if a job $i$ starts exactly at the completion time of the preceding job $j$.
An ED-schedule S can naturally be divided into independent portions that will be referred to as blocks: A block is a consecutive part in schedule S consisting of the successively scheduled jobs without any gap in between any two neighboring jobs; a block is preceded and succeeded by a (possibly 0-length) gap.
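The block decomposition can be computed in a single pass over the schedule. The following helper is our own sketch; `order` lists the jobs in processing order, while `start` and `proc` are indexed by job, and, in line with the convention above, two jobs separated only by a 0-length gap stay in the same block:

```python
def blocks(order, start, proc):
    """Split a schedule into blocks: maximal runs of successively scheduled
    jobs with no idle time between any two neighbouring jobs."""
    result, current = [], [order[0]]
    for prev, j in zip(order, order[1:]):
        if start[j] > start[prev] + proc[prev]:   # positive-length gap
            result.append(current)                # close the current block
            current = [j]
        else:                                     # contiguous (0-length gap)
            current.append(j)
    result.append(current)
    return result

# Job 2 starts at time 7 while job 1 completes at time 5, so a gap of
# length 2 separates the two blocks.
assert blocks([0, 1, 2], [0, 3, 7], [3, 2, 1]) == [[0, 1], [2]]
```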
The kernels. The whole set of jobs can be partitioned, roughly, into two kinds of jobs: non-critical and critical ones. Intuitively, the non-critical jobs are flexible in the sense that they might be moved within a feasible schedule without affecting the objective value, unlike the critical jobs which attain the maximum value of the objective function. Below we define our critical jobs formally.
Consider a maximal consecutive job sequence $K$ in a block of an ED-schedule $S$ ending with, say, job $o$, such that $L_o(S) = L_{\max}(S)$ and no job from this sequence has a due-date greater than $d_o$, i.e., $d_j \le d_o$ for all $j \in K$. In terms of the delivery times, for the equivalent setting we have $C_o(S) = C_{\max}(S)$ and $q_j \ge q_o$ for all $j \in K$ (so any job of the sequence is no less urgent than job $o$).
We call such a sequence $K$ in an ED (LDT) schedule $S$ a kernel, and we call job $o$ the corresponding overflow job (abusing the notation, we use $K$ also for the corresponding job-set). We let $r(K) = \max_{j \in K} r_j$, the maximum job release time in kernel $K$. We stress that no gap may exist within a kernel.
If a kernel $K$ is immediately preceded by a gap, then it starts a new block; otherwise, it is immediately preceded and delayed by a job $l$ with $d_l > d_o$ ($q_l < q_o$, respectively). In general, we may have more than one job $e$ with $d_e > d_o$ ($q_e < q_o$) scheduled before kernel $K$ within the block containing that kernel (such a job $e$ is pushing the jobs of kernel $K$ in the sense that the removal of that job would restart the first job of kernel $K$ earlier). We call such a job $e$ an emerging job, and the job $l$ above the delaying emerging job for the kernel $K$ in schedule $S$.
As we will see in detail later, by rescheduling an emerging job to a later time moment and restarting the kernel jobs earlier, the current maximum objective value can be reduced in a newly created ED (LDT) schedule. The following halting condition is from [11] (see Proposition 1 in [11]):
Proposition 1. The initial ED (LDT) schedule σ is optimal if it contains a kernel K with its earliest scheduled job starting at its release time; equivalently, if there exists no (delaying) emerging job for that kernel.
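The kernel, the overflow job and the emerging jobs of an LDT-schedule can be located by a backward scan from the overflow job. The sketch below uses the same parallel-list representation as before; the function name is ours, and the returned `optimal` flag checks the condition of Proposition 1 for the initial schedule:

```python
def find_kernel(order, start, proc, release, delivery):
    """Locate the (earliest) kernel of an LDT-schedule given as a job list in
    processing order with per-job start times. Returns the kernel job list,
    the overflow job, the emerging jobs (latest scheduled first) and the
    Proposition 1 optimality flag for the initial schedule."""
    comp = {j: start[j] + proc[j] for j in order}        # completion times
    full = {j: comp[j] + delivery[j] for j in order}     # full completions
    cmax = max(full.values())
    pos = min(i for i, j in enumerate(order) if full[j] == cmax)
    o = order[pos]                                       # overflow job
    k = pos
    # Extend backwards while jobs are contiguous and no less urgent than o.
    while (k > 0 and start[order[k]] == comp[order[k - 1]]
           and delivery[order[k - 1]] >= delivery[o]):
        k -= 1
    kernel = order[k:pos + 1]
    # Emerging jobs: less urgent jobs before the kernel, in the same block.
    emerging, i = [], k - 1
    while i >= 0 and start[order[i + 1]] == comp[order[i]]:
        if delivery[order[i]] < delivery[o]:
            emerging.append(order[i])
        i -= 1
    optimal = start[kernel[0]] == release[kernel[0]]     # Proposition 1
    return kernel, o, emerging, optimal
```

On a three-job instance with release times (0, 1, 1), processing times (3, 1, 1) and delivery times (0, 10, 10), the initial LDT-schedule runs the jobs in the order 0, 1, 2 starting at times 0, 3, 4; jobs 1 and 2 form the kernel, job 2 is the overflow job, and job 0 is the (delaying) emerging job.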
If the above condition does not hold, our enumeration procedure is initiated by creating alternative branches, one for each of the emerging jobs. Before we describe our branching scheme in more detail, we give some basic properties of the ED schedules that we enumerate.
First, we observe that an ED schedule $S$ may contain more than one kernel, and that the overflow job $o$ of a kernel $K$ is succeeded either by a gap or by a job $j$ with $d_j > d_o$ (if $S$ is an LDT schedule, then $q_j < q_o$, hence job $j$ is less urgent than job $o$). We will denote the earliest arisen kernel in schedule $S$ by $K(S)$ and will refer to it as the kernel of that schedule; respectively, we will refer to the corresponding overflow job as the overflow job of schedule $S$.
Given an ED-schedule $S$ with the delaying emerging job $l$ for the kernel $K = K(S)$, we denote by $\delta(K, S)$ the delay of kernel $K$ in schedule $S$, i.e., the forced right-shift imposed by the delaying job $l$ on the jobs of kernel $K$. The following known fact easily follows from Proposition 1 and the easily seen observation that no job of kernel $K$ is released by the time job $l$ is started in schedule $S$, as otherwise the ED (LDT) heuristic would have included the former job instead of job $l$:
Property 1. $\delta(K, S) < p_l$.
Let $L_{\max}(S)$ be the objective value of an ED (LDT) schedule $S$ and let $L_{\max}(S^*)$ be that of an optimal schedule $S^*$. Property 1 easily implies the next well-known corollary.
Corollary 1. $L_{\max}(S) - L_{\max}(S^*) < p_l$.
Given a (non-optimal) ED-schedule $S$, we now specify how an alternative ED (LDT) schedule can be created from that schedule. We will say that an emerging job $e$ is activated for the kernel $K(S)$ in schedule $S$ if it is rescheduled after that kernel. We activate job $e$ so that the resultant schedule, called a complementary (to $S$) schedule, is also an ED (LDT) schedule. We do this by merely increasing artificially the release time of job $e$ to the maximum release time of a job in kernel $K(S)$, so that the heuristic, once applied to the (in this way) modified problem instance, reschedules job $e$ after all the jobs of kernel $K(S)$. Indeed, since job $e$ becomes released no earlier than any job of kernel $K(S)$, the heuristic will include any kernel job before job $e$.
The jobs of kernel $K(S)$ can be left-shifted in the complementary schedule, i.e., they may be restarted earlier than in schedule $S$. In particular, this will be the case if no new emerging job (one included after kernel $K(S)$ in schedule $S$) gets included before that kernel in the complementary schedule; otherwise, a newly included emerging job may similarly be activated.
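The construction of a complementary schedule can then be sketched as follows: the release time of the activated emerging job is raised to the maximum kernel release time, and the heuristic is rerun on the modified instance. The compact `ldt` helper below implements the extended Jackson heuristic described in Section 2; all names are ours:

```python
import heapq

def ldt(release, proc, delivery):
    """Extended Jackson (LDT) heuristic: returns the jobs in processing
    order together with their start times (indexed by job)."""
    n = len(release)
    seq = sorted(range(n), key=lambda j: release[j])
    ready, t, i, order, start = [], 0, 0, [], [0] * n
    while len(order) < n:
        if not ready:                       # idle: jump to the next release
            t = max(t, release[seq[i]])
        while i < n and release[seq[i]] <= t:
            heapq.heappush(ready, (-delivery[seq[i]], seq[i]))
            i += 1
        _, j = heapq.heappop(ready)         # a most urgent released job
        start[j] = t
        t += proc[j]
        order.append(j)
    return order, start

def complementary(release, proc, delivery, e, kernel):
    """Activate emerging job e for the given kernel: artificially raise r_e
    to the maximum release time of a kernel job and rerun the heuristic, so
    that every kernel job is included before job e."""
    r = list(release)
    r[e] = max(r[j] for j in kernel)
    return ldt(r, proc, delivery)
```

On a three-job instance with release times (0, 1, 1), processing times (3, 1, 1) and delivery times (0, 10, 10), activating job 0 for the kernel {1, 2} restarts the kernel jobs at times 1 and 2, reducing the maximum full completion time from 15 to 13.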
Example 1. In Table 1 and Table 2 we give a small randomly generated problem instance with 10 jobs, one of the instances that we have tested during our experiments. The initial ED-schedule σ is represented in Table 2 and Figure 1, in the form of a table and graphically, respectively. It is easy to see that the kernel consists of jobs 3 and 8 (8 is the overflow job), and jobs 0 and 9 are emerging jobs. Table 3 represents a modified problem instance, in which the release time of the delaying emerging job 9 is artificially increased. The result of the application of the ED-heuristic to that modified instance is a complementary schedule, represented in Table 4 and Figure 2. The optimal schedule for the instance is depicted in Figure 3. The kernel of that schedule consists of a single job 5, which is also the overflow job; there exists no emerging job in that schedule.

3. Basic Enumeration Scheme
In this section we describe how we enumerate the feasible solutions. Our basic branching scheme, which relies on Propositions 1 and 2, is similar to that used in [4,5]. In the next section we complete our enumeration framework with further pruning and halting conditions, straightforwardly incorporated into this section's basic scheme.
We associate with every node $h$ in our search tree a complete ED schedule $S_h$. For simplicity, we will refer to node and stage $h$ interchangeably, and denote by $T_h$ the enumeration tree generated by stage $h$; initially, $T_0 = \{0\}$ (where 0 is the root). A closed node in tree $T_h$ is one that will be assigned no successors, whereas an open node has no successors yet but may still receive them.
At stage $h$ we determine the kernel $K(S_h)$, the overflow job of that kernel, and the set $E(S_h)$ of the emerging jobs in schedule $S_h$ (this takes low-order polynomial time). By definition, a kernel itself contains no emerging job, but the block containing kernel $K(S_h)$ may include one or more emerging jobs for that kernel, the latest scheduled of which is the delaying emerging job $l$ (the one that pushes the earliest scheduled job of the kernel). It is a known fact that if an ED (LDT) schedule $S$ is not optimal, then in an optimal schedule at least one of the emerging jobs from that block is scheduled after kernel $K(S)$; in general, the objective value of schedule $S_h$ cannot be reduced in any descendant of node $h$ if $E(S_h) = \emptyset$; otherwise it suffices to generate a complementary schedule for each emerging job $e$:
Proposition 2. If either $E(S_h) = \emptyset$ or the first scheduled job of kernel $K(S_h)$ starts at its release time, then node h can be closed. Otherwise, let $e_1, \dots, e_k$ be the emerging jobs in set $E(S_h)$. Then it suffices to create k immediate successors of node h, one for each emerging job $e_i$.
Proof. First, it is easy to see that if $E(S_h) = \emptyset$, then either kernel $K(S_h)$ starts schedule $S_h$ or the earliest scheduled job of that kernel starts at its release time. Our claim is obvious for the former case. Assume then that the first scheduled job of kernel $K(S_h)$ starts at its release time. It is easy to see that no rearrangement of the jobs of that kernel may decrease the objective value of the overflow job. At the same time, since the first scheduled job of kernel $K(S_h)$ starts at its release time, no job rearrangement involving the jobs scheduled before the kernel may restart the earliest scheduled job of the kernel earlier than it is scheduled in schedule $S_h$. It follows that the full completion time of the overflow job cannot be decreased by any job rearrangement, and hence node h can be closed.
In case $E(S_h) \ne \emptyset$, let $e_1, \dots, e_k$ be the enumeration of the emerging jobs in set $E(S_h)$ in the reverse of the order in which they appear in schedule $S_h$ (so $e_1$ is the delaying emerging job). The k immediate successors of node h are created for the emerging jobs in this order, the ith successor representing the complementary schedule obtained by activating $e_i$. □
A formal description of our enumeration procedure that creates a complete search tree $T$ is given in Algorithm 1.

Algorithm 1: PROCEDURE Enumerate()  {generates tree $T$}
  Initial settings: $h := 0$; construct the initial ED (LDT) schedule $\sigma$;
    IF there is no emerging job in schedule $\sigma$ THEN stop  {$\sigma$ is optimal by Proposition 1}
  Iterative step:
    IF the condition of Proposition 2 holds for node $h$ THEN
      close node $h$;  {by Proposition 2}
      IF there is an open node in the tree THEN
        {backtrack} to the (leftmost) closest open node and repeat Iterative step
      ELSE stop
    ELSE  {$E(S_h) \ne \emptyset$; construct a complementary schedule for each emerging job}
      create $k$ successors of node $h$, one for each emerging job in $E(S_h)$, and repeat Iterative step
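Putting the pieces together, the basic scheme of Algorithm 1 can be sketched as a depth-first enumeration over modified release-time vectors. This is our own illustrative rendering, not the authors' implementation; in particular, the `limit` safeguard and the memoisation of already-visited instances are our additions:

```python
import heapq

def ldt(release, proc, delivery):
    """Extended Jackson (LDT) heuristic (as described in Section 2)."""
    n = len(release)
    seq = sorted(range(n), key=lambda j: release[j])
    ready, t, i, order, start = [], 0, 0, [], [0] * n
    while len(order) < n:
        if not ready:
            t = max(t, release[seq[i]])
        while i < n and release[seq[i]] <= t:
            heapq.heappush(ready, (-delivery[seq[i]], seq[i]))
            i += 1
        _, j = heapq.heappop(ready)
        start[j] = t
        t += proc[j]
        order.append(j)
    return order, start

def analyse(order, start, proc, delivery):
    """Makespan, kernel and emerging jobs of an LDT-schedule."""
    comp = {j: start[j] + proc[j] for j in order}
    full = {j: comp[j] + delivery[j] for j in order}
    cmax = max(full.values())
    pos = min(i for i, j in enumerate(order) if full[j] == cmax)
    o, k = order[pos], pos
    while (k > 0 and start[order[k]] == comp[order[k - 1]]
           and delivery[order[k - 1]] >= delivery[o]):
        k -= 1
    kernel, emerging, i = order[k:pos + 1], [], k - 1
    while i >= 0 and start[order[i + 1]] == comp[order[i]]:
        if delivery[order[i]] < delivery[o]:
            emerging.append(order[i])
        i -= 1
    return cmax, kernel, emerging

def enumerate_schedules(release, proc, delivery, limit=10000):
    """Basic scheme of Algorithm 1: branch on every emerging job of the
    current schedule; close a node when it has no emerging job or its
    kernel starts at its (true) release time (Propositions 1 and 2)."""
    best = float("inf")
    stack, seen = [tuple(release)], set()
    while stack and limit:
        r = stack.pop()
        if r in seen:
            continue
        seen.add(r)
        limit -= 1
        order, start = ldt(list(r), proc, delivery)
        cmax, kernel, emerging = analyse(order, start, proc, delivery)
        best = min(best, cmax)
        if not emerging or start[kernel[0]] == release[kernel[0]]:
            continue                       # close this node
        for e in emerging:                 # one successor per emerging job
            r2 = list(r)
            r2[e] = max(r2[j] for j in kernel)   # activate e for the kernel
            stack.append(tuple(r2))
    return best
```

On the small instance with release times (0, 1, 1), processing times (3, 1, 1) and delivery times (0, 10, 10), a single branching step already reaches the optimal makespan of 13.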
4. Optimality and Approximability Conditions
While enumerating different complementary schedules in tree $T$, an interaction of the kernel of the current complementary schedule with the kernel of some earlier generated complementary schedule(s) may occur. In particular, the former kernel may coincide with the latter one, or the jobs from two or more earlier detected kernels may be joined into a newly formed kernel. A detection and proper analysis of such an interaction gives more insight into the problem and is also beneficial for the reduction of the search space. Below we briefly describe kernel interactions and give a few relevant definitions (the reader is referred to [11] for a more detailed presentation).
Recall that, given that the kernel $K(S_h)$ of schedule $S_h$ possesses an emerging job, such an emerging job $e$ is activated, resulting in a new complementary schedule, an immediate successor of schedule $S_h$. In the new schedule, the processing order of the jobs of kernel $K(S_h)$ may or may not coincide with the processing order of these jobs in schedule $S_h$. If the processing orders are different, then some job of the kernel becomes rescheduled earlier compared to its (former) position in schedule $S_h$, and it becomes an emerging job for the kernel of the new complementary schedule, either immediately or in a descendant of that schedule. Then PROCEDURE Enumerate() will similarly generate another complementary schedule, an immediate successor in solution tree $T$. A similar scenario can be repeated as long as, as a result of the order change in the next created complementary schedule, a former kernel job becomes the delaying emerging job for the newly arisen kernel (a sub-kernel of kernel $K(S_h)$), i.e., kernel $K(S_h)$ collapses; kernel $K(S_h)$ fully collapses in a complementary schedule (a descendant of the above schedules) if the last activation of a job of kernel $K(S_h)$ in a predecessor of that schedule yielded no further order change, whereas all jobs of the current kernel are ones from kernel $K(S_h)$ and the latter kernel possesses no delaying emerging job (we refer the reader to Section 4 of [11] for complete formal definitions and more details). Thus, in each newly created complementary schedule above, except the first one, a new emerging job, a former kernel job, is activated (note that, since during the collapsing of kernel $K(S_h)$ at most as many emerging jobs may arise as there are jobs in that kernel, the total number of the created complementary schedules in which that kernel collapses is bounded by the size of the kernel).
Example 2. We illustrate the kernel collapsing using the small instance of Table 5 and Table 6. It is our third randomly generated problem instance with 10 jobs, which we abbreviate as N_3_10. Figure 4 and Figure 5 illustrate the initial ED-schedule σ and the resulting complementary ED-schedule. As we can observe from Table 7 and Table 8, in the latter schedule the kernel consisting of jobs 0 and 6 is collapsed.

So far, we have basically relied on earlier known facts in the branching and pruning rules used in the enumeration procedure of Section 3. In this section we give a few additional properties which are beneficial for the further reduction of the size of tree $T$.
Lemma 1. Let g be a successor-node of a node h in the tree $T$. If the kernel $K(S_h)$ does not collapse in schedule $S_g$ and kernels $K(S_h)$ and $K(S_g)$ have a job in common, then the two kernels have the same overflow job. Furthermore, if the first job of kernel $K(S_g)$ starts at its release time in schedule $S_g$, then that schedule is optimal.
Proof. Since kernel $K(S_h)$ does not collapse in schedule $S_g$, the order of the common jobs in both kernels, in both schedules $S_h$ and $S_g$, is the same. Furthermore, since kernel $K(S_g)$ contains a job of kernel $K(S_h)$, the last job in both kernels is the overflow job in both schedules, and therefore the two kernels have the same overflow job. The second claim follows from Proposition 1. □
Kernels $K(S_h)$ and $K(S_g)$ are said to be independent if $K(S_h) \cap K(S_g) = \emptyset$, i.e., the two kernels have no job in common (h and g being defined as above).
Suppose kernels $K(S_h)$ and $K(S_g)$ are not independent and $K(S_h) \ne K(S_g)$. Then it is easy to see that if all jobs of kernel $K(S_g)$ also belong to kernel $K(S_h)$, then kernel $K(S_g)$ is obtained as a result of the collapsing of kernel $K(S_h)$. Otherwise (all jobs of kernel $K(S_h)$ belong to kernel $K(S_g)$), kernel $K(S_g)$ is said to be an extension of kernel $K(S_h)$.
Lemma 2. If kernel $K(S_g)$ is an extension of kernel $K(S_h)$, then kernel $K(S_g)$ includes the emerging job(s) activated for the kernel $K(S_h)$ in the corresponding branch of tree $T$.
Proof. Note that at least one emerging job e is activated for the kernel $K(S_h)$ in tree $T$. Since kernels $K(S_h)$ and $K(S_g)$ belong to the same block (otherwise they would have been independent), job e belongs to this block together with the jobs of both kernels. Moreover, job e is not an emerging job for the kernel $K(S_g)$, as otherwise the two kernels would be independent. Since job e is not an emerging job for the kernel $K(S_g)$ and belongs to the same block as that kernel, it forms part of it. □
Now we present our approximability condition, which guarantees a certain approximation factor for each complementary schedule created in tree T.
Theorem 1. Suppose a branch of tree T contains κ complementary schedules possessing κ different independent kernels with κ different delaying emerging jobs. Then the κth complementary schedule is a (1 + 1/κ)-approximation one.
Proof. Let h_1, …, h_κ be the stages with the independent kernels K(h_1), …, K(h_κ) and the corresponding distinct delaying emerging jobs l_1, …, l_κ, and let σ(h_1), …, σ(h_κ) be the corresponding LDT-schedules. By Corollary 1, the delay of kernel K(h_i) in schedule σ(h_i) is less than p(l_i), for i = 1, …, κ. Since l_1, …, l_κ are distinct jobs, p(l_1) + ⋯ + p(l_κ) ≤ OPT. Furthermore, for the purpose of this estimation, we may assume that all the arisen delaying emerging jobs have the same processing time, as otherwise the minimum of p(l_1), …, p(l_κ) will be achieved at the earliest stage with the shortest delaying emerging job (notice that for the estimation of our approximation, every p(l_i) is a valid expression). Thus
min_i p(l_i) ≤ (p(l_1) + ⋯ + p(l_κ))/κ ≤ OPT/κ,
and the delay of none of the κ kernels is more than OPT/κ. Then
|σ(h_κ)| ≤ OPT + OPT/κ = (1 + 1/κ) OPT. □
Theorem 1 gives a guaranteed approximation factor for each next complementary schedule created in tree T. Of course, our algorithm can be stopped at any moment when a desired approximation is attained by the next enumerated schedule. In fact, within a short running time of the procedure, the parameter κ becomes large enough to guarantee a good approximation (see Section 5).
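A minimal sketch of how the guarantee tightens with κ, assuming the factor of Theorem 1 in the form 1 + 1/κ (our reading of the averaging argument in its proof; the function name is ours):

```python
def guaranteed_factor(kappa):
    """Approximation factor ensured once kappa independent kernels with
    distinct delaying emerging jobs have been detected in a branch
    (assumed form 1 + 1/kappa)."""
    if kappa < 1:
        raise ValueError("at least one kernel is required")
    return 1.0 + 1.0 / kappa

# The guarantee tightens quickly: kappa = 5 already ensures a factor of 1.2.
```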
Theorem 1 will not guarantee a desired approximation for a complementary schedule σ(h) if the branch of tree T ending with node h does not contain "enough" complementary schedules with independent kernels; i.e., this branch contains complementary schedules in which the kernel of a predecessor schedule was either extended or collapsed. These cases are dealt with in the following two lemmas.
Lemma 3. Suppose that in a complementary schedule σ(g) kernel K(h) fully collapses and the overflow job in schedule σ(g) is a job from kernel K(h) (where h is a predecessor node of node g in tree T). Then schedule σ(g) is optimal.
Proof. The full completion time of any job from a fully collapsed kernel is a lower bound on the optimal schedule makespan (see again Section 4 of [11]). Then the full completion time of the last scheduled job of kernel K(h) is a lower bound on the optimum, since this job is the overflow job in schedule σ(g). Hence, schedule σ(g) is optimal. □
Lemma 4. Let kernel K(g) be an extension of kernel K(h). Then an emerging job for the kernel K(g) in schedule σ(g), if any, is a (former) emerging job for the kernel K(h) and is included before the jobs of that kernel in schedule σ(h). If there exists no such emerging job, then node g can be closed.
Proof. Since kernel K(g) is an extension of kernel K(h), there may exist no emerging job for the kernel K(g) included after the jobs of kernel K(h) in schedule σ(g) (observe that stage g is a successor of stage h in the corresponding branch of solution tree T). This shows the first claim. As to the second claim, suppose there exists no such emerging job for the kernel K(g). In particular, the delaying emerging job l of kernel K(h) in schedule σ(h) is either in the state of activation in schedule σ(g) or/and it is not an emerging job for the kernel K(g). In the first case, there occurs a gap before the first job of kernel K(h) in schedule σ(g), and hence the activation of no job included before that gap may left-shift any job scheduled after this gap in schedule σ(g). In particular, the completion time of the overflow job is the minimal possible for the jobs of kernel K(g), and the current branch of computation can be abandoned. In the second case, as a result of the activation of the emerging job(s) scheduled before job l, either there again arises a gap (now before the delaying emerging job l), or there is no such gap. In the former case, we apply the above reasoning similarly. In the latter case, any emerging job activated for the kernel K(h) belongs to kernel K(g), none of them being an emerging job for the kernel K(g). It follows that there may exist no job whose activation could potentially decrease the current completion time of the jobs from kernel K(g), and the second claim of the lemma is proved. □
5. Discussion and Experimental Results
We have implemented our algorithm in C++ using Visual Studio 2012 on a personal computer with 16 GB of RAM, the Windows 10 operating system and a 3.4 GHz Intel i64470 processor (the code together with the generated problem instances can be found at [25]). We have generated our instances randomly, applying standard rules commonly used for scheduling problems. In each generated instance with n jobs, the release time r_i, the processing time p_i and the due-date d_i (equivalently, the delivery time q_i) of job i were drawn pseudo-randomly from intervals whose bounds depend on the number of jobs n.
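The generation scheme can be sketched as follows; the exact interval bounds used in the experiments are not recoverable from the text, so the uniform integer ranges below (scaling with n) are purely illustrative assumptions:

```python
import random

def generate_instance(n, seed=0):
    """Sketch of a random instance generator for n jobs; returns a list of
    (release, processing, due) tuples.  All ranges are assumed, not the
    paper's actual parameters."""
    rng = random.Random(seed)
    jobs = []
    for _ in range(n):
        p = rng.randint(1, 50)                 # processing time (assumed range)
        r = rng.randint(0, 10 * n)             # release time grows with n (assumed)
        slack = rng.randint(0, 10 * n)         # extra room before the due-date
        jobs.append((r, p, r + p + slack))     # due-date >= earliest completion
    return jobs
```

A due-date d_i can equivalently be converted into a delivery time q_i = K − d_i for a suitably large constant K, as described in the introduction.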
As we have mentioned earlier, in our statistics we have omitted the instances that were solved optimally already by the standard optimality condition of Proposition 1 (these instances formed the majority of the generated instances). As to publicly available instances for the problem, we have found only those from the earlier mentioned reference [16]. Although the input (i.e., the job parameters) from that reference coincides with that of our problem, the objective function dealt with in that reference is different from ours. Hence, our results cannot be directly compared to those reported in [16]. However, we have tested our algorithm on a considerable number of the largest instances (those with 1000 jobs) from [16], in particular, ones generated with parameters (ALPHA, BETA) = (0.25, 0.1), (0.25, 0.25), (0.25, 0.5), sixty instances in total. All of these instances were solved optimally, instantaneously, at the first stage of our algorithm by the ED-heuristic.
We have tested 50 difficult problem instances (ones not discarded by Proposition 1) with 10, 20, 50, 100, 200, 300, 400, 500, 600, 700, 800, 900, 1000, 2000, 5000 and 10,000 jobs. For large instances, we have imposed an upper limit on the running time of our procedure (10, 30 and 60 s for the instances with 1000 and 2000 jobs, and 300, 900 and 3600 s for those with 5000 and 10,000 jobs, respectively), since our goal was to verify the approximation guaranteed by Theorem 1 within a short execution time of the procedure. Moderate-sized instances were solved optimally; in particular, most of the instances with up to 50 jobs. A good approximation was attained for the instances with a larger number of jobs that were not solved optimally. In general, we have observed that, in practice, larger instances attained a better approximation. From 50 up to 1000 jobs, the guaranteed approximation improved monotonically, reaching the factor of 1.2 for 1000 jobs. About 10% of the larger problem instances with 2000 and 5000 jobs were solved optimally within one minute of running time, and the average approximation factor was about 1.3. Remarkably, the average approximation factor for the largest tested instances with 10,000 jobs decreased to about 1.25.
Recall from Theorem 1 that we estimate the approximation attained by an enumerated feasible solution by the parameter κ, the number of detected independent kernels with different delaying emerging jobs in the branch of tree T containing that solution (the more such kernels arise, the better the ensured approximation). In the summary table below, column "E" (column "κ", respectively) specifies the number of emerging jobs (the maximal number of arisen independent kernels with different delaying emerging jobs, respectively) in a branch of the search tree T, and a further column gives the corresponding (average) approximation factor.
Table 9 summarizes the average performance of our algorithm for the tested difficult problem instances, and
Figure 6 plots an average dependence of the approximation factor on the total number of jobs.
The reader may also have a look at the detailed tables presented in [25], from which we may observe that the approximation factor that we guarantee for each created solution is not tight. For example, an instance might be solved optimally by our enumeration procedure even though the approximation factor that we can guarantee for any created solution cannot be 1 (see the value of κ for the optimally solved instances in the detailed tables from [25]).
We have also tested the behavior of our enumeration framework in the worst possible scenarios that we could create. To that end, we have generated pseudo-random artificial problem instances forming the second class of instances. They were created to be the most inconvenient ones for our algorithm, so that it would be forced to perform an almost complete enumeration of the candidate ED-schedules. Our aim was to verify the approximation that our algorithm could still guarantee in a reasonable time. As intended, it failed to create an optimal solution already for quite moderately sized artificial instances. At the same time, an extremely good approximation factor, very close to 1, was guaranteed for these instances. Before presenting the results of the computational experiments, we describe how we have created these instances. Each artificial instance contains three different types of jobs, with the same number of jobs of each type; we have generated 50 instances, with sizes ranging from 12 up to 2001 jobs. We refer to the three types as type a, type b and type c jobs. Type a jobs are "tight" jobs, i.e., for a type a job j, d_j = r_j + p_j; every type a job may potentially form a kernel consisting of that single job. We have left enough space between the different jobs of type a, so that for any neighboring pair of type a jobs j and i with r_j < r_i, r_j + p_j < r_i. The release times of type a jobs were generated pseudo-randomly subject to this restriction. The type b and type c jobs are designed to be included in between and after the (urgent) type a jobs. All the type b and type c jobs are released at time 0. At the same time, the type b and type c jobs are paired in a special way; in particular, their due-dates and processing times are determined as follows. For each pair (b, c) of a type b job b and the corresponding type c job c, the due-date of job b is slightly larger than that of job c, so that, whenever two or more jobs of both types are available, the ED-heuristic would take the wrong choice, giving priority to the type c job.
In particular, the processing time of job j is drawn from one trial interval if job j is of type b, and from another, larger-valued trial interval if job j is a type c job. We have one type b job k for each pair of two neighboring type a jobs (j, i), so that it exactly fills in the space between the jobs j and i in the optimal solution; i.e., p_k = r_i − (r_j + p_j) (in addition, there is a type b job whose processing time lets it fit exactly before the earliest released type a job). Type c jobs are, on average, longer than type b jobs, according to the way the trial intervals for the random derivation of the processing times were determined.
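The construction above can be sketched as follows; the concrete numeric ranges (gap widths, processing times, the exact "slightly larger" offsets) are our illustrative assumptions only:

```python
import random

def artificial_instance(m, seed=0):
    """Sketch of the three-type adversarial construction: m jobs of each
    type, returned as (release, processing, due) tuples in the order
    type a, type b, type c.  Type a jobs are tight (due = release +
    processing); each type b job exactly fills the gap before a type a
    job; the paired type c job is longer but slightly more urgent, so
    an ED-heuristic would wrongly prefer it."""
    rng = random.Random(seed)
    a_jobs, gaps = [], []
    prev_finish = 0
    t = rng.randint(5, 15)                     # release of the first type a job
    for _ in range(m):
        p = rng.randint(2, 6)
        a_jobs.append((t, p, t + p))           # tight job: d = r + p
        gaps.append(t - prev_finish)           # space a type b job must fill
        prev_finish = t + p
        t = prev_finish + rng.randint(5, 15)   # leave room for the next gap
    horizon = prev_finish + sum(g + 3 for g in gaps)  # rough end of schedule
    b_jobs, c_jobs = [], []
    for i, g in enumerate(gaps):
        d_c = horizon + 2 * i                  # generous due-date for the c job
        b_jobs.append((0, g, d_c + 1))         # exact filler; due slightly larger
        c_jobs.append((0, g + rng.randint(1, 3), d_c))  # longer, "more urgent"
    return a_jobs + b_jobs + c_jobs
```

With m = 4 this yields a 12-job instance of the same shape as the one illustrated below.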
In
Table 10 below we illustrate an artificial instance with 12 jobs and we represent an optimal solution for that instance in
Table 11 and
Figure 7.
As we can easily see, in the optimal schedule all the type a jobs start at their release times and the corresponding type b job is scheduled in between each pair of the neighboring type a jobs, whereas all the type c jobs are included behind all type a and type b jobs at the end of the schedule (see Figure 7). The optimal objective value is easy to calculate for any such instance: clearly, the maximum job lateness in the optimal schedule is 0 (it cannot be less because of the tight type a jobs). This optimal schedule, however, is difficult to create by our implicit enumeration of ED-schedules. Indeed, since type c jobs are more urgent than type b jobs, the ED-heuristic repeatedly includes a type c job instead of the corresponding type b job between each two neighboring type a jobs j and i. So there is a type c job included between each pair of the neighboring type a jobs in the initial ED-schedule σ, whereas the more urgent type a jobs are included after those type c jobs. Since type c jobs are longer than type b jobs, every type c emerging job included in between two neighboring type a jobs j and i yields a forced delay of the tight type a job i. As a result, job i either forms a kernel or becomes part of one, whereas the corresponding type c job becomes the delaying emerging job. Once activated, that type c job repeatedly gets included before another tight type a job, again becomes an emerging job and is newly activated. In each so-created ED-schedule, some type a job is the overflow job, whereas all the type c jobs included before that type a job are emerging jobs. Once all type c emerging jobs become activated, in the interval before a type a job a "wrong" type b job might be included, e.g., one which is longer than the correct type b job corresponding to that type a job. Then such a type b job also becomes an emerging job and gets activated. Thus both type b and type c jobs are potential emerging jobs, which causes the creation of an excessive number of ED-schedules in the search tree T.
In Table 12, we summarize the average performance of our algorithm for some of the artificial instances, for which we have imposed no prior restriction on the execution time of the algorithm: the execution was stopped due to memory overflow, as indicated in the column labeled "Time". The results for the rest of the instances of the second class can be found in the tables from [25]. In Table 12, the column "Instance" indicates the name of the corresponding instance; in column "Jobs" the number of jobs is specified; columns "Width" and "Depth" indicate the width and the depth, respectively, of the solution tree T constructed for the corresponding instance; "BS Level" stands for the level in the search tree T of the best obtained solution; the execution (processor) time is specified in seconds in the column "Time"; two further columns specify the maximum job lateness in the initial ED-schedule σ and in the best obtained solution (column "Best"); the column "Completed" indicates whether the best obtained solution is optimal ("Y") or its optimality is not guaranteed ("N").
As we can see, the maximal number of the activated emerging jobs in a branch of tree T is close to a fixed fraction of the total number of jobs (see the column labeled "E"), whereas the number of the created ED-schedules grows very fast with the number of jobs (see column "Nodes"). As intended, the number of enumerated solutions for the instances of the second class turned out to be essentially larger than that for the first class of instances. However, the approximation factor provided by our algorithm already reached 9/8 for the smallest instances of the second class, with 12 jobs (although these instances were solved optimally), whereas for the larger instances, from 21 up to 600 jobs, the average approximation ratio was improving sharply. Within the first 10 s of the execution time of our algorithm, the guaranteed approximation is almost 1 for the instances with 1002 and 2001 jobs. Thus, although our algorithm was forced to enumerate a very large number of feasible ED-schedules, it provided an extremely good approximation for the artificial instances of the second class within a very short execution time.