A Branch-and-Bound Algorithm for Minimizing the Total Tardiness of Multiple Developers

In the game industry, tardiness is an important issue. Unlike a unifunctional machine, a developer may excel in programming but be mediocre in scene modeling. His/her processing speed varies with job type. To minimize tardiness, we need to schedule these developers carefully. Clearly, traditional scheduling algorithms for unifunctional machines are not suitable for such versatile developers. On the other hand, in an unrelated machine scheduling problem, n jobs can be processed by m machines at n × m different speeds, i.e., its solution space is too wide to be simplified. Therefore, a tardiness minimization problem considering three job types and versatile developers is presented. In this study, a branch-and-bound algorithm and a lower bound based on harmonic mean are proposed for minimizing the total tardiness. Theoretical analyses ensure the correctness of the proposed method. Computational experiments also show that the proposed method can ensure the optimality and efficiency for n ≤ 18. With the exact algorithm, we can fairly evaluate other approximate algorithms in the future.


Introduction
Game development is a complicated professional domain in which three limited resources, i.e., money, manpower, and time, have to be carefully managed. First, we should note that the cost of developing a multimedia game, e.g., an online game, is rising. The budget for developing a commercial multi-player game is at least $1,000,000. For some large-scale games, e.g., Grand Theft Auto V, developing a single version may even cost a company around $10,000,000 [1][2][3][4]. But once a successful game is released, it might earn a billion dollars in profit, e.g., [5,6]. In light of the above observations, game development involves considerable expertise, such as product planning, graphic design, sound design, programming, and testing. To avoid endless budget amendments, it is essential to carefully schedule all the jobs at the beginning.
Such a large game cannot be implemented by a single developer, making good teamwork another essential factor. A small-sized game may be implemented by a single designer. However, for some large-scale multimedia games, the team size may range from 3 to 100 professionals [7,8]. Each game draws upon various areas of expertise. For instance, a single piece of music requires various professional skills, e.g., composing, songwriting, dubbing, and sound effects. The professionals with these skills are sourced from different kinds of personnel pools. Some may be official company employees, while others may be temporarily recruited freelancers. Clearly, semi-finished products made by the former must be passed to the latter on schedule. If a critical job is delayed, it may leave dozens of professionals idle. Since their wages, hotel expenses, and dining fees need to be paid even during such an idle period, such costly human resources need to be carefully scheduled in advance.
The third major resource is time. These multimedia games must eventually be released onto the market, so game developers have to race against time to finish them as early as possible. Since these developers have various areas of expertise, their time costs are different. Consider two developers, Mary and Tom. Mary may be highly proficient at figure design, while Tom may excel in scene design. There are 100 figure design jobs, and both developers are qualified for these jobs. Any daily delay of a job will result in a $100 penalty. Consider further that Mary requires 30 person-days and charges $50,000; Tom takes 50 person-days and charges $30,000. To whom should we assign these jobs? How does a delayed job affect the following jobs? These jobs must be carefully scheduled in the beginning. Any delay in a critical job may cause serious damage. For scheduling such a project, the cost, time, and expertise should be considered as a whole. Clearly, it is not easy to solve such a scheduling problem by labor-intensive means. That is, such a scheduling problem in the game industry is no less challenging than those in the aviation, semiconductor, and construction industries.
In light of the above observations, it is clear that tardiness minimization is important in the game industry. In general, the jobs in a large multimedia game have various properties and different degrees of tolerance to delay. For example, the job of leading figure design should be completed as early as possible; such an urgent job should not be delayed. Conversely, late poster design or user manual translation may not cause a huge loss. Ideally, all the jobs would be completed on time. However, as in other industries, it is difficult to schedule more than 20 jobs manually. Therefore, tailored scheduling algorithms for reducing tardiness in the game industry are called for.
Assigning similar jobs to a developer with corresponding expertise helps to reduce tardiness. With the continued refinement of the game industry, developers specialize in different areas of expertise. Let us consider the above example again. Mary should be assigned figure design jobs, and Tom, scene design jobs. However, there are still some other constraints. Suppose that Mary is overloaded with a lot of figure design jobs. Although Mary is highly proficient at figure design, we had better assign some figure design jobs to Tom. Clearly, the computation of such trade-offs is very complicated. This is because multi-specialty developers are not taken into account in traditional scheduling models. Again, some new algorithms for scheduling such jobs and developers in the game industry are needed.
The following three properties distinguish the presented problem from traditional ones. First, for traditional heterogeneous machine scheduling problems, e.g., [9,10], a capable machine always outperforms others in terms of speed. However, for the presented problem, a developer may excel in figure design and programming but be mediocre in script design. That is, a single developer (or machine) simultaneously has both merits and shortcomings. It depends on what jobs are assigned to him/her. The considerations of the presented problem are more complicated than those of traditional heterogeneous machine scheduling problems.
Second, compared with identical machine scheduling problems, e.g., [11,12], the amount of computation of the presented problem is large. For m identical machines, we do not need to consider their permutations. Therefore, the solution space of the presented problem is about m! times larger than that of an identical machine scheduling problem. To the best of our knowledge, few researchers have focused their efforts on this emerging industry, and its limited resources (i.e., money, time, and manpower) are seldom discussed.
Third, it is difficult to develop efficient lower bounds in a traditional unrelated machine scheduling problem, e.g., [13,14]. For n jobs, the processing speeds of m machines are all different; there are m × n various combinations, i.e., a large solution space. However, in most situations, a game developer usually processes his/her own desired jobs, i.e., one or two types. Such unrelated machine models are too complicated to schedule these developers in the game industry.
Mathematics 2022, 10, 1200
In this study, an optimization problem is presented. Traditional scheduling algorithms cannot be directly applied to the problem for two reasons. First, in unifunctional machine scheduling problems, a machine usually processes jobs at a fixed speed, e.g., a welding robot. In the presented problem, the processing speed is determined by the fitness between developers and job types. That is, more combinations need to be considered. Second, jobs with agreeable processing times and due dates, e.g., [15], are commonly employed to develop lower bounds and minimize tardiness. However, this technique leads to an anomaly in the presented problem. Consequently, we propose an exact algorithm to schedule these various jobs and versatile developers in the game industry. Two main contributions are made in this study. First, a branch-and-bound algorithm is proposed for ensuring the optimality for n ≤ 18. Second, a lower bound based on a harmonic mean is developed to improve the execution efficiency.
The rest of this study is organized as follows. In Section 2, past research is introduced. In Section 3, the scheduling problem considering versatile developers is formulated. In Section 4, a lower bound and a branch-and-bound algorithm are developed. In Section 5, experiments are conducted to show the execution efficiency of the proposed algorithms. Conclusions are drawn in Section 6.

Related Work
In this section, the motivations for tardiness minimization in the game industry are introduced. Moreover, the differences between the presented problem and traditional ones are also discussed.

Game Industry
Game development requires effective control of manpower, money, and time. Manpower plays a vital role in this industry. Unlike ordinary industries (e.g., lumbering), the modern game industry is dependent on versatile developers cooperating to develop their products. These developers may be first-party designers (e.g., Nintendo), second-party developers (e.g., [16]), or even third-party participants (e.g., [17][18][19]). Each of them may be multi-functional, able to deal with several kinds of jobs. This implies that scheduling these developers is more complicated than scheduling unifunctional machines in traditional industries. On the other hand, human resources in the game industry are expensive. From 2007 to 2018, the annual salaries of these developers increased from $66,000 to $73,000 [20,21]. Moreover, the team sizes range from a few to a hundred professionals, and the members of a team may be geographically distributed across two or three continents [7,8]. Although the scale of a game project is not always so large, the amount of computation for scheduling such a game project is still amazing. Especially for some critical jobs, tardiness can lead to heavy penalties [22,23]. Poorly managed projects may result in missed deadlines, cost overruns, reworking, or complaints. For more management failures, please refer to [24].
Incurring high costs is not a rare phenomenon in the game industry. Due to advancements in technology, the plot of an online game may be more complicated than that of a motion picture, and the settings of a large game may be more fantastic than any scenic spot in the real world. Consequently, the costs of some well-known games, e.g., Call of Duty ($250 million), are higher than those of some science fiction films [1,25]. Moreover, a multimedia game is usually developed by a team instead of a single designer. Consequently, any delay may keep dozens of developers idle, which may entail further expenditures on wages, hotel expenses, and dining fees. In fact, due to poor budget control, most commercial games did not earn profits [7,26,27]. Such failures imply that effective budget control is an important issue in the game industry. For more information about game budgets and revenues, readers can refer to [2][3][4][5][6][8][28][29][30][31].
Time management is inevitable in the game industry. Both the game industry and traditional industries require massive capital investment, but the game industry has a special feature: time effectiveness. If a new house is 10 days late for the market, its price will not change greatly. However, if the official release of a commercial game is postponed and a rival gets ahead of the game, no players will consider the product of the loser because the novelty soon wears off. Moreover, with big data, we are able to predict or estimate a developer's behaviors at the operational level easily. For example, all the technological processes of a developer can be observed and recorded, e.g., processing time, job type, failure probability; these data can be established as a database or a smart factory and such experiences can be repeatedly accessed and utilized [32][33][34]. In light of these observations, we learn that it becomes more possible and more necessary to punctually manage a large game project than before.
In summary, the game industry has grown to a considerable scale. It is impossible for a single designer to implement a large-scale multimedia game. Consequently, more efficient and effective management of manpower, money, and time is needed, rather than traditional labor-intensive tools.

Total Tardiness
In traditional multi-machine scheduling problems, the objective is usually to minimize the total tardiness, i.e., the sum of all jobs' tardiness. Each job is tagged with a due date. Once a job is delayed, the objective cost increases. Moreover, in general, the tardiness of a job may lead to its successors' tardiness. That is, there might be a ripple effect, in which the delay of a small job can affect the whole project. This means that the amount of computation of scheduling is huge, especially for multi-machine scheduling. For example, Mensendiek et al. [35] aimed to minimize the total tardiness of all the jobs on identical machines. They proposed a branch-and-bound algorithm for generating the optimal solutions and a metaheuristic algorithm, i.e., a genetic algorithm, for obtaining approximate solutions. Due to the NP-hardness of this problem, the branch-and-bound algorithm performed well only for n ≤ 18, where n is the number of jobs. Wang [9] considered a total tardiness minimization problem on heterogeneous machines. A branch-and-bound algorithm was developed to ensure optimality for n ≤ 18. Note that m identical machines are much easier to schedule than m heterogeneous machines. For m identical machines, we only focus on how to permute n jobs, and we do not need to consider how to permute these identical machines. For example, the studies in [13,36,37] solved easier multi-machine scheduling problems. The reasons are stated as follows. First, since these machines are the same, the number of all the possible solutions for identical machines is just about 1/(m!) of that for heterogeneous machines. Second, each identical machine processes all jobs at a fixed speed. However, in our presented problem, a developer can perform jobs at different speeds, depending on the type of each job. These differences imply that scheduling heterogeneous machines is more difficult. For more references to tardiness minimization, readers can refer to [11][38][39][40][41].
Developing a large-scale game may involve thousands of jobs and hundreds of developers. Without proper scheduling algorithms, some jobs may be tardy. More seriously, a ripple effect will cause more jobs to be delayed. For example, tardiness will lead to low return rate and poor customer satisfaction [42,43]. Consequently, it is worthwhile to develop scheduling algorithms to minimize the total tardiness in the game industry.

Branch-and-Bound Algorithm
Branch-and-bound algorithms always generate the optimal solutions. These solution techniques are used for solving discrete and combinatorial optimization problems. Their great merit is their optimality, whereas their shortcoming is time consumption. Consequently, they are employed only for small problems. For example, for using branch-and-bound algorithms to minimize total tardiness on a single machine, the maximal problem sizes for [44][45][46] are 18, 20, and 25, respectively. For minimizing the total tardiness on identical machines, the maximal problem sizes for [11,47] are 10 and 25, respectively. If machines are heterogeneous, the optimally solvable problem size of a branch-and-bound algorithm will decrease, e.g., n = 15 in [43] and n = 18 in [9]. This is because the solution space of a heterogeneous machine scheduling problem will be m! times larger than that of an identical machine scheduling problem. Even so, many researchers have focused on developing branch-and-bound algorithms in traditional industries. Unlike traditional industries, however, the modern game industry has various jobs and versatile developers. That is, more efficient algorithms are needed for the complicated situation in the game industry. However, such exact algorithms are very time-consuming. For some tardiness minimization problems, e.g., an unrelated machine scheduling problem [48], their lower bounds were obtained by directly solving two max-min sub-problems, i.e., brute-force search. Consequently, some efficient lower bounds for accelerating branch-and-bound algorithms are also required. Table 1 lists relevant studies that employ branch-and-bound algorithms to achieve their objectives in different environments. We divide these environments into two types: unifunctional machines and multi-functional ones. For example, a pump is a unifunctional machine used only for water distribution in [49].
Furthermore, branch-and-bound algorithms for scheduling unifunctional machines can be subdivided into three subtypes. First, the algorithms designed for a single machine are too simple to schedule multiple machines. Second, the branch-and-bound algorithms for identical machines still have their limitations, since they do not consider the permutations of different machines when optimizing. For example, since machines in [45] are identical, their position orders do not influence the design of a branch-and-bound algorithm. However, in this study, the situation is complicated by various developers. Third, even heterogeneous machines process jobs of the same type only. For example, a powerful asphalt milling machine in [43] is still not able to lift containers. For each heterogeneous machine, only one fixed processing speed is considered. However, a developer in the game industry can process jobs of different types, e.g., programming and songwriting; hence, one developer may process jobs at several different paces. That is, a branch-and-bound algorithm for this study needs to search a larger solution space than other branch-and-bound algorithms for heterogeneous machines. A multi-functional machine can process jobs of multiple types, i.e., at different speeds. Unrelated machines can perform various jobs, and hence there are m × n different speeds, where m is the number of machines and n is the number of jobs. Because no regular patterns can be summarized or deduced from so many relationships, some researchers have abandoned their attempts to develop an exact algorithm, e.g., [14]. Consequently, only a few studies have focused on developing branch-and-bound algorithms for these unrelated machines. On the other hand, in the game industry, developers requiring preparations are similar to unrelated machines with uncertain setup times.
However, jobs regarding game development can be categorized into a few types, and only a few relationships need to be taken into account. That is, an unrelated machine model is too complicated to develop an efficient lower bound. Some new exact algorithms for scheduling these versatile developers are thus required.
Moreover, the optimal solutions generated by branch-and-bound algorithms can be good benchmarks for evaluating metaheuristic algorithms. Despite the high number of jobs in game development, these jobs may be merely scheduled in Microsoft Excel or recorded in Google Calendar. Clearly, such a handmade itinerary is not qualified to be a benchmark for evaluating other methods. To our best knowledge, no researchers have investigated similar tardiness minimization problems in the game industry. A likely reason is that traditional grey-media PC games are simple and can be implemented by a few amateurs. Nowadays, however, the tide of the game industry is turning. Scheduling the jobs of a multimedia game project can be very complicated; hence, we need exact solutions to evaluate other metaheuristic algorithms.

Problem Formulation
The optimization problem is formulated as follows. There are $n$ non-preemptive jobs and $m$ developers. Each job $j$ has a default processing time $p_j$, a due date $d_j$, and a job type $e_j \in \{1, 2, 3\}$ for $j = 1, 2, \ldots, n$. For each job type $x$, developer $a$ has a processing difficulty ratio $r_{ax}$ for $a \in \{1, 2, \ldots, m\}$ and $x \in \{1, 2, 3\}$. That is, if job $j$ of type $x$ (i.e., $e_j = x$) is assigned to developer $a$, the actual processing time is $p_j r_{ax}$. Each job needs to be assigned to one and only one developer, and each developer can process only one job at a time. If job $j$ is assigned to developer $a$ according to a schedule $\pi$, the actual completion time is denoted by $C_{j@a}(\pi)$ and the tardiness is defined by $T_{j@a}(\pi) = \max\{0, C_{j@a}(\pi) - d_j\}$. Under the above assumptions and constraints, we aim to determine an optimal schedule $\pi^*$ which minimizes the total tardiness; i.e., the minimization problem is defined by $f(\pi^*) = \min_{\pi} f(\pi)$ with $f(\pi) = \sum_{j=1}^{n} T_{j@a}(\pi)$, where $f(\pi)$ means the objective function.
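The objective function can be evaluated mechanically once a schedule is written down. The following is a minimal sketch, assuming a schedule is encoded as a tuple of job numbers in which 0 separates consecutive developers (the encoding used in the instance that follows). The function name `total_tardiness` is hypothetical. The ratios $r_{11} = 1$, $r_{13} = 5$, $r_{22} = 1$, and $r_{23} = 3$ are implied by that instance, whereas $r_{12}$ and $r_{21}$ are placeholder values chosen only to fill the matrix.

```python
def total_tardiness(schedule, p, d, e, r):
    """Total tardiness f(pi) of a separator-encoded schedule.

    schedule: 1-based job numbers; a 0 moves on to the next developer.
    p, d, e:  default processing times, due dates, and job types (1..3).
    r:        r[a][x - 1] is the difficulty ratio of developer a + 1 for type x.
    """
    developer, clock, total = 0, 0, 0
    for job in schedule:
        if job == 0:  # separator: the next developer starts at time 0
            developer, clock = developer + 1, 0
            continue
        j = job - 1
        clock += p[j] * r[developer][e[j] - 1]  # completion time C_{j@a}
        total += max(0, clock - d[j])           # tardiness T_{j@a}
    return total


p = [20, 10, 30, 6, 10]
d = [20, 20, 50, 70, 10]
e = [1, 1, 2, 3, 3]
# r[0][1] and r[1][0] are assumed placeholders; the other entries follow
# from the actual processing times quoted in the worked instance.
r = [[1, 2, 5], [2, 1, 3]]

print(total_tardiness((1, 2, 4, 0, 5, 3), p, d, e, r))  # prints 40
```

The placeholders do not affect the result here, because the schedule never assigns a type-2 job to developer 1 or a type-1 job to developer 2.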
A problem instance is shown in Figure 1a. Let $n = 5$, $m = 2$, $p_j = 20, 10, 30, 6, 10$, $d_j = 20, 20, 50, 70, 10$, and $e_j = 1, 1, 2, 3, 3$, for $j = 1, 2, 3, 4, 5$. The processing difficulty ratios are listed in Figure 1b. Let $\pi = (1, 2, 4, 0, 5, 3)$ be a schedule, where the number 0 is a separator used to divide jobs between developers. Since developer 1 is highly proficient at dealing with jobs of type 1, job 1 and job 2 are assigned to developer 1, and their processing times are 20 and 10, respectively. Similarly, since developer 2 excels at processing jobs of type 2, job 3 is processed by developer 2, and its actual processing time is 30. Note that neither developer is skilled in processing jobs of type 3 (i.e., jobs 4 and 5). Since job 5 has an early due date, let developer 2 process it first, and its actual processing time is 30 ($= 10 \times r_{23} = 10 \times 3$). Similarly, developer 1 requires a processing time of 30 ($= 6 \times r_{13} = 6 \times 5$) to process job 4. Eventually, the total tardiness is $f(\pi) = 40$ (i.e., $0 + 10 + 10 + 0 + 20$).
It is clear that the above scheduling problem is different from traditional ones. The following features differentiate the presented problem from traditional ones:
• Compared with traditional unrelated machine scheduling problems, the concept of job type can reduce the amount of computation. In unrelated machine models, all the relationships between machines and jobs must be taken into account, e.g., the probability of machine $i$ processing job $j$ ($p_{ij}$) in [13] and the processing time of machine $i$ processing job $j$ ($p_{ij}$) in [14]. All the $m \times n$ combinations must be considered. If a job set is not given, blindly estimating each machine's average processing speed cannot determine a good lower bound. However, in the presented problem, jobs can be categorized into three types, so only $m \times 3$ processing speeds are considered, and their average processing speeds can be employed to develop a lower bound.

• In past heterogeneous machine scheduling models [41,69,71], a capable developer (or machine) always outperforms others in terms of processing speed. That is, each heterogeneous machine has its own fixed speed. However, in this presented problem, a developer might be mediocre in processing jobs of type 1 but excel in dealing with jobs of other types. Clearly, these developers cannot be modeled by such unifunctional machines.
• Compared with traditional identical machine scheduling problems, e.g., [11,12], the presented problem is more difficult. An example is given in Table 2. Consider that we allocate the three jobs to three identical machines. It is obvious that there is only one schedule, and it is just the optimal schedule; i.e., each machine takes one job. However, in the presented problem, a capable developer might take all three jobs to achieve optimality. Consequently, we need to check all the possible situations listed in Table 2 to determine the optimal schedule.
• Traditional tardiness minimization techniques cannot be directly applied to this problem. Jobs with larger processing difficulty ratios may take precedence over jobs with earlier due dates. In the presented problem, the processing difficulty ratio, processing time, and due date should be considered as a whole.

| Job Assignment | Number of Possible Schedules |
| --- | --- |
| one developer takes three jobs; two developers are idle | $C^3_1 \cdot 3!$ |
| one developer takes two jobs and another one takes one job; the remaining one is idle | $C^3_1 \cdot 2! \cdot C^2_1 \cdot 1!$ |
| each developer takes one job | $C^3_1 \cdot C^2_1 \cdot C^1_1$ |
| total | 36 |

In light of the above observations, traditional scheduling algorithms cannot be directly applied to this problem, and a new optimization algorithm is thus required. If a project meets the following two criteria, its manpower can be arranged by such an algorithm. First, a project is interdisciplinary and it recruits cross-domain workers, no matter what kinds of workers it needs, e.g., employee or freelancer. A developer may acquire several competencies and can perform several kinds of jobs in the project. Second, the performance pattern of each developer is known. That is, we have the big data of all developers and can estimate each one's processing time for processing a given kind of job [32][33][34].

Branch-and-Bound Algorithm
In this section, we develop a branch-and-bound algorithm (named BB). To obtain the optimal schedules, BB will explore each search tree in the depth-first-search (DFS) order. Moreover, to deter us from exploring useless partial schedules, we also propose some dominance rules and develop a lower bound.
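The depth-first search with pruning can be sketched as follows. This is a simplified illustration under stated assumptions, not the BB algorithm itself: it omits the dominance rules, and its bound is only the tardiness already incurred by the determined part, which is valid because tardiness is non-negative and extending a partial schedule never lowers its cost. The function name `branch_and_bound`, the separator encoding with 0, and the ratios marked as placeholders are assumptions made for this example.

```python
def branch_and_bound(p, d, e, r):
    """Return (minimal total tardiness, schedule) by depth-first search.

    Developers are filled one after another; a 0 in the schedule marks the
    switch to the next developer. Jobs are numbered from 1 in the output.
    """
    n, m = len(p), len(r)
    best = {"cost": float("inf"), "schedule": None}

    def dfs(a, clock, remaining, cost, seq):
        if cost >= best["cost"]:   # bound: this branch cannot beat the incumbent
            return
        if not remaining:          # leaf: every job has been placed
            best["cost"], best["schedule"] = cost, seq
            return
        for j in sorted(remaining):  # branch 1: append job j to developer a
            done = clock + p[j] * r[a][e[j] - 1]
            dfs(a, done, remaining - {j},
                cost + max(0, done - d[j]), seq + (j + 1,))
        if a + 1 < m:                # branch 2: close developer a, open a + 1
            dfs(a + 1, 0, remaining, cost, seq + (0,))

    dfs(0, 0, frozenset(range(n)), 0, ())
    return best["cost"], best["schedule"]


p = [20, 10, 30, 6, 10]
d = [20, 20, 50, 70, 10]
e = [1, 1, 2, 3, 3]
r = [[1, 2, 5], [2, 1, 3]]  # r[0][1] and r[1][0] are assumed placeholders

cost, schedule = branch_and_bound(p, d, e, r)
print(cost, schedule)
```

A tighter lower bound on the undetermined part would prune earlier than this accumulated-cost bound, which is the role of the bound developed later in this section.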

Dominance Rules
For convenience, we introduce some notation before developing the dominance rules. Let $\pi = (\alpha, \beta)$ be an undetermined schedule, where $\alpha$ is a determined partial sequence and $\beta$ is the undetermined part. We wonder if there exists any schedule $\pi'$ that outperforms $\pi$. Consequently, some dominance rules are developed to answer this question. Since these rules are similar, we provide only the first proof.
Case I: Consider that jobs $i$ and $j$ are the last two jobs of $\alpha$ and both jobs are assigned to the same developer $a$. Let $\pi'$ be the schedule obtained by only interchanging the last two jobs $i$ and $j$ in $\alpha$. For simplicity, let $C_{i@a}(\pi) = t_i$, $C_{j@a}(\pi') = t'_j$, and $C_{j@a}(\pi) = t_j = C_{i@a}(\pi') = t'_i$. In the following rules, both jobs $i$ and $j$ in $\pi$ are tardy. However, if we interchange them, their tardiness can be alleviated a little. In Rule 1, though both jobs are still tardy in $\pi'$, their resulting tardiness is lower than that in $\pi$. In Rule 2, the interchange makes job $j$ not tardy, i.e., the resulting tardiness is reduced.
Proof. We prove this property by showing $T_{i@a}(\pi) + T_{j@a}(\pi) > T_{i@a}(\pi') + T_{j@a}(\pi')$. That is, The proof is complete.
In the following four rules, job $j$ is tardy and job $i$ is not in $\pi$. Rules 3 and 4 show that job $j$ can be done in time in $\pi'$ and the resulting tardiness can be improved. In Rules 5 and 6, job $j$ is still tardy in $\pi'$, but the accumulated tardiness can be alleviated.
Rule 7 lets the job with an earlier due date be processed first if both jobs, i.e., $i$ and $j$, are not tardy in $\pi$. That is, both objective costs of $\pi$ and $\pi'$ are the same, and we can stop searching for one of them.
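The interchange argument shared by the Case I rules can be checked numerically. The sketch below does not restate the conditions of the individual rules; it only compares the tardiness of the last two jobs on one developer in both orders, which is the quantity the proof above compares. The function names and the sample numbers are hypothetical, and each job is given as a pair of actual processing time (already multiplied by the developer's difficulty ratio) and due date.

```python
def pair_tardiness(start, first, second):
    """Tardiness sum of two jobs run back to back from time `start`.

    Each job is an (actual_processing_time, due_date) pair.
    """
    c_first = start + first[0]
    c_second = c_first + second[0]
    return max(0, c_first - first[1]) + max(0, c_second - second[1])


def interchange_improves(start, job_i, job_j):
    """True if running j before i yields a strictly smaller tardiness sum."""
    return pair_tardiness(start, job_j, job_i) < pair_tardiness(start, job_i, job_j)


# Both jobs are tardy in either order; putting the shorter job j first
# lowers the sum from 24 to 20.
job_i, job_j = (6, 5), (2, 5)
print(pair_tardiness(10, job_i, job_j))        # prints 24
print(pair_tardiness(10, job_j, job_i))        # prints 20
print(interchange_improves(10, job_i, job_j))  # prints True
```

Because the second job finishes at the same time in both orders, only the completion time of the first job changes, which is why the comparison reduces to a local test on the two jobs.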
Case II: Consider that job $i$ is the last job of $\alpha$, which is assigned to developer $a$, and job $j$ can be any undetermined job in $\beta$. Moreover, job $i$ is also the last job assigned to developer $a$. For simplicity, let $C_{i@a}(\pi) = t_i$ and $e_j = y \in \{1, 2, 3\}$. In Rule 8, it would be wasteful to assign very few jobs to developer $a$. That is, he/she can accept an extra job in $\beta$ if no tardiness occurs.

Rule 8. If $t_i + p_j r_{ay} \le d_j$, then $\pi'$ dominates $\pi$.
Case III: Consider that job $i$ is the last job of $\alpha$, which is assigned to developer $a$, and job $j$ can be any undetermined job in $\beta$. Let $e_i = x \in \{1, 2, 3\}$, $e_j = y \in \{1, 2, 3\}$, and $\pi'$ be the schedule obtained by interchanging job $i$ in $\alpha$ and job $j$ in $\beta$. For simplicity, let $C_{i@a}(\pi) = t_i$ and $C_{j@a}(\pi') = t'_j$. In Rule 9, we interchange job $i$ in $\alpha$ and job $j$ in $\beta$ if job $j$ is more urgent and the total tardiness will not deteriorate. In Rule 10, developer $a$ is mediocre at processing jobs of type $x$ but highly proficient at processing jobs of type $y$. On the other hand, all the remaining developers excel at dealing with jobs of type $x$ and are mediocre at processing jobs of type $y$. Therefore, we interchange job $i$ in $\alpha$ and job $j$ in $\beta$. Note that the total tardiness will not be worse in this case.
The following lemma shows that each developer's workload has a squeeze effect. That is, if there exists a developer whose workload is unreasonably heavy, then there must be another developer who has a relatively light workload. Due to space limitations, the following proofs can be found in Appendix A.
Lemma 1. For a schedule $\pi$, if there exists a developer $a$ whose maximum completion time is larger than $\left(\sum_{j=1}^{n} \max_{1 \le i \le m} p_j r_{i e_j}\right)/m + \max_{1 \le i \le m} \max_{1 \le j \le n} p_j r_{i e_j}$, there exists another developer $b$ whose maximum completion time is less than $\left(\sum_{j=1}^{n} \max_{1 \le i \le m} p_j r_{i e_j}\right)/m$.
The following rule can help us to avoid some unnecessary searches if any developer is overloaded. If there exists a developer whose maximum completion time is unreasonably long, then we can remove a job from the overloaded developer and assign it to a half-loaded developer. That is, the previous schedule is dominated.

Rule 11. For an optimal schedule $\pi^*$, each developer's maximum completion time is less than or equal to $\left(\sum_{j=1}^{n} \max_{1 \le i \le m} p_j r_{i e_j}\right)/m + \max_{1 \le i \le m} \max_{1 \le j \le n} p_j r_{i e_j}$.
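The workload limit in Rule 11 is inexpensive to compute before the search starts. The following is a minimal sketch, assuming the job data of the five-job instance from the problem formulation; the function name `workload_limit` is hypothetical, and the two ratios marked as placeholders are assumed values rather than values taken from Figure 1b.

```python
def workload_limit(p, e, r):
    """Upper limit on any developer's completion time in an optimal schedule.

    For every job, take its slowest actual processing time over all
    developers; the limit is the mean of these values per developer plus
    the largest of them.
    """
    m = len(r)
    slowest = [max(p_j * r[a][e_j - 1] for a in range(m))
               for p_j, e_j in zip(p, e)]
    return sum(slowest) / m + max(slowest)


p = [20, 10, 30, 6, 10]
e = [1, 1, 2, 3, 3]
r = [[1, 2, 5], [2, 1, 3]]  # r[0][1] and r[1][0] are assumed placeholders

# Slowest times per job are 40, 20, 60, 30, 50, so the limit is 200/2 + 60.
print(workload_limit(p, e, r))  # prints 160.0
```

During the search, any partial schedule in which a developer's completion time already exceeds this value can be discarded.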

Lower Bound
A lower bound is needed to avoid unnecessary searching if we are in the middle of a schedule that is dominated or outperformed by others. That is, after adding up the determined cost and the estimated cost of the remaining part, if the sum is still larger than the current minimal objective cost, we can abandon further searches for the remaining part. Consequently, the earlier we can stop useless searches, the more execution time we can save.
Arranging jobs with agreeable processing times and due dates is a useful way to obtain a lower bound for traditional tardiness minimization problems [15]. Here, agreeableness is a kind of job correlation. Just as precedence between two jobs means that a successor (e.g., testing) cannot start until its predecessor (e.g., programming) has finished, agreeableness between any two jobs means that the smaller job (i.e., the one with less processing time) always has the earlier due date. However, it may lead to some anomalies in our problem. Consider that two identical developers (or machines) deal with the two jobs shown in Figure 2a. To obtain a lower bound, in Figure 2b, traditional algorithms may adjust the processing times and due dates and make them agreeable, i.e., $d_{(i)} \le d_{(j)}$ if and only if $p_{(i)} \le p_{(j)}$. Hence, the lower bound for these jobs is 0; i.e., $\max\{0, 4 - 6\} + \max\{0, 6 - 8\}$. However, in our problem, an anomaly occurs. Let $e_1 = 1$, $e_2 = 2$, and let the processing difficulty ratios be as shown in Figure 2c. For the original jobs shown in Figure 2a, the optimal objective cost is 4; i.e., $\max\{0, 3 \times 4 - 8\} + \max\{0, 1 \times 6 - 6\}$. For the virtual jobs shown in Figure 2b, the lower bound is 6; i.e., $\max\{0, 3 \times 4 - 6\} + \max\{0, 1 \times 6 - 8\}$. However, a valid lower bound must never be larger than the minimal cost. Consequently, we cannot directly apply this technique here. In our problem, a job with a larger processing difficulty ratio may be more urgent than another with an earlier due date. That is, the processing times, due dates, and processing difficulty ratios should be considered as a whole in this problem.
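The three figures quoted in this example can be reproduced with a few lines of arithmetic. The sketch below only evaluates the expressions given in the text, with each job running alone from time 0 so that its completion time equals its actual processing time; the function name `tardiness_sum` is hypothetical.

```python
def tardiness_sum(completion_times, due_dates):
    """Sum of max(0, C - d) over jobs that each start at time 0."""
    return sum(max(0, c - d) for c, d in zip(completion_times, due_dates))


# Agreeable virtual jobs without difficulty ratios: p = (4, 6), d = (6, 8).
agreeable_plain = tardiness_sum([4, 6], [6, 8])

# Original jobs with ratios 3 and 1: actual times (12, 6), due dates (8, 6).
original_with_ratios = tardiness_sum([3 * 4, 1 * 6], [8, 6])

# Agreeable virtual jobs with the same ratios: due dates become (6, 8).
agreeable_with_ratios = tardiness_sum([3 * 4, 1 * 6], [6, 8])

print(agreeable_plain, original_with_ratios, agreeable_with_ratios)  # prints 0 4 6
```

The last value exceeds the second one, so the agreeable rearrangement no longer yields a valid lower bound once difficulty ratios are involved.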
Since these developers differ in their abilities (i.e., various processing difficulty ratios), we aim to fabricate an equivalent substitute to replace these heterogeneous developers in the real world. Consider that there are k different developers in the real world and assume that there are k virtually identical developers whose integrated ability is just equal to the sum of all the real ones' abilities. The following definition gives the correct magnitude of processing difficulty ratio for each virtual developer. It is interesting that the magnitude is the harmonic mean of all the real developers' processing difficulty ratios.
Definition 1. There are k available developers numbered from m − k + 1 to m in the real world. Let there be k virtually identical developers. For each job type x, the equivalent processing difficulty ratio of each virtual developer is $k/(1/r_{m-k+1,x} + 1/r_{m-k+2,x} + \dots + 1/r_{mx})$ and is denoted by ${}^{k}\bar{r}_x$.
The following lemma shows that the throughput (i.e., the amount of work per unit time) of these virtual developers is the same as the sum of all the real ones' throughputs. For more information about harmonic mean, readers can refer to [72,73].

Lemma 2.
For a given job type x, the sum of the last k real developers' throughputs (i.e., $\sum_{a=m-k+1}^{m} 1/r_{ax}$) is equal to the total throughput of the k virtual developers (i.e., $k/{}^{k}\bar{r}_x$, where ${}^{k}\bar{r}_x$ is the per-developer ratio of Definition 1).
Now we can merge these virtual developers into a virtual substitute. The following definition gives the correct magnitude of the processing difficulty ratio of the single substitute. Moreover, the following lemma shows that the throughput of the k virtual developers is exactly equal to that of the virtual single substitute.

Definition 2.
There are k available developers numbered from m − k + 1 to m in the real world. Let there be only one virtually equivalent developer, called the substitute. For each job type x, the processing difficulty ratio of the substitute is $1/(1/r_{m-k+1,x} + 1/r_{m-k+2,x} + \dots + 1/r_{mx})$ and is denoted by ${}^{k}\hat{r}_x$; i.e., it equals the virtual developer's ratio of Definition 1 divided by k.

Lemma 3.
For a given job type x, the sum of the last k real developers' throughputs (i.e., $\sum_{a=m-k+1}^{m} 1/r_{ax}$) is equal to the substitute's throughput (i.e., $1/{}^{k}\hat{r}_x$, where ${}^{k}\hat{r}_x$ is the substitute's ratio of Definition 2).
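Definitions 1 and 2 and Lemmas 2 and 3 can be checked numerically with a minimal Python sketch. The ratios in `real` are hypothetical, and the function names `virtual_ratio` and `substitute_ratio` are illustrative rather than taken from the paper; exact fractions are used so that the equalities hold without rounding.

```python
from fractions import Fraction

def virtual_ratio(ratios):
    """Definition 1: harmonic mean of the k real ratios for one job type."""
    return Fraction(len(ratios)) / sum(Fraction(1, r) for r in ratios)

def substitute_ratio(ratios):
    """Definition 2: ratio of the single substitute (the virtual ratio divided by k)."""
    return 1 / sum(Fraction(1, r) for r in ratios)

real = [2, 4, 4]  # hypothetical ratios of k = 3 real developers for one job type
k = len(real)

# Total work per unit time of the real developers: 1/2 + 1/4 + 1/4.
real_throughput = sum(Fraction(1, r) for r in real)
virtual_throughput = k / virtual_ratio(real)          # Lemma 2
substitute_throughput = 1 / substitute_ratio(real)    # Lemma 3

print(virtual_ratio(real), substitute_ratio(real))    # 3 1
print(real_throughput == virtual_throughput == substitute_throughput)  # True
```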
The virtual substitute still has a different processing difficulty ratio for each job type. The following definition provides the upper and lower limits of the processing difficulty ratios for the substitute, and the following lemma bounds the throughput of the k real developers. This lemma guarantees that the maximal throughput of the virtual substitute is at least as large as the throughput of the k real developers.

Definition 3.
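For each virtual developer transformed from k available real developers (i.e., the substitute of Definition 2, whose ratio for job type x is written ${}^{k}\hat{r}_x$), let ${}^{k}\hat{r}_{\min} = \min\{{}^{k}\hat{r}_1, {}^{k}\hat{r}_2, {}^{k}\hat{r}_3\}$ and ${}^{k}\hat{r}_{\max} = \max\{{}^{k}\hat{r}_1, {}^{k}\hat{r}_2, {}^{k}\hat{r}_3\}$ denote his/her minimal processing difficulty ratio and maximal processing difficulty ratio, respectively.

Lemma 4.
The throughput of the last k real developers is less than or equal to $1/{}^{k}\hat{r}_{\min}$.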
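As a small illustration of Definition 3, the following Python sketch computes the substitute's ratio for each job type and then its minimal and maximal ratios. The matrix `r` is hypothetical, and the check covers only workloads consisting of a single job type, for which the throughput is the reciprocal of the corresponding substitute ratio.

```python
from fractions import Fraction

def substitute_ratios(r):
    """r[a][x] is the ratio of real developer a for job type x.

    Returns the substitute's ratio for every job type (Definition 2).
    """
    num_types = len(r[0])
    return [1 / sum(Fraction(1, dev[x]) for dev in r) for x in range(num_types)]

r = [[1, 5, 8],   # hypothetical developer strong at job type 1
     [6, 2, 9]]   # hypothetical developer strong at job type 2

subs = substitute_ratios(r)
r_min, r_max = min(subs), max(subs)

# Single-type throughputs lie between 1/r_max and 1/r_min.
throughputs = [1 / s for s in subs]
print(r_min, r_max)  # 6/7 72/17
print(all(1 / r_max <= t <= 1 / r_min for t in throughputs))  # True
```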
Algorithm 1 presents the proposed lower bound (named LB). Note that BB explores a schedule π = (α, β) in DFS order. Since the jobs in α are determined, we let some job k be the last job of α, i.e., lj(α) = k, and some developer a process it, i.e., ld(α) = a. Hence, there are m − a available developers before $C_k(\pi)$. By Lemma 4, we can regard these m − a developers as a virtual developer with processing speed $1/{}^{m-a}\hat{r}_{\min}$, i.e., the reciprocal of the minimal substitute ratio of Definition 3. Similarly, the m − a + 1 developers available from $C_k(\pi)$ onward can be viewed as another virtual developer with processing speed $1/{}^{m-a+1}\hat{r}_{\min}$. In Step 1, we determine which job is the last job in α and which developer completes it. In Steps 2-3, the jobs in β are transformed into n − l new jobs whose processing times and due dates are agreeable, i.e., $p_{(i)} \le p_{(j)}$ if and only if $d_{(i)} \le d_{(j)}$. This modification ensures that such a lower bound will not be larger than the actual optimal cost [74,75]. Then we allocate these jobs to the two virtual developers at the pace of a unit job in Steps 5-14. We preemptively allocate the transformed jobs, starting from time 0. While LB proceeds before time $C_k(\pi)$, we allocate the workload to the first virtual developer (Steps 9-10); otherwise, we let the second virtual developer process the remaining part (Steps 12-13). In Step 15, the estimated tardiness, if any, is accumulated. Finally, the estimated lower bound is returned.

Algorithm 1. The proposed lower bound (LB(π,l)).
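The description above can be illustrated with a rough Python sketch. It is not the paper's Algorithm 1: it assumes integer processing times, makes the jobs agreeable by sorting processing times and due dates independently, and takes the two virtual developers' ratios as plain arguments (`r_before` for the period before the completion time of the last fixed job, `r_after` from that time onward). All names are hypothetical.

```python
def lower_bound_sketch(proc_times, due_dates, c_last, r_before, r_after):
    """Estimate total tardiness of the undetermined jobs on a two-phase virtual developer.

    proc_times, due_dates: data of the jobs still in beta (integer processing times).
    c_last: completion time of the last job already fixed in alpha.
    r_before, r_after: time needed per unit of work before / from c_last onward.
    """
    # Make the jobs agreeable: the i-th shortest job gets the i-th earliest due date.
    p_sorted = sorted(proc_times)
    d_sorted = sorted(due_dates)

    t = 0.0          # current time on the virtual developer
    total = 0.0      # accumulated estimated tardiness
    for p, d in zip(p_sorted, d_sorted):
        for _ in range(p):                      # allocate one unit of work at a time
            t += r_before if t < c_last else r_after
        total += max(0.0, t - d)
    return total

# Hypothetical data: two remaining jobs, last fixed job completes at time 2.
estimate = lower_bound_sketch([2, 1], [1, 5], c_last=2, r_before=2, r_after=1)
print(estimate)  # 1.0
```

In this run the shortest job finishes at time 2 (one unit at 2 time units per unit of work) against due date 1, and the second finishes at time 4 against due date 5, so only the first contributes tardiness.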
Theorem 1 establishes the correctness of the proposed lower bound. By Theorem 1, the objective cost of the substitute (i.e., the lower bound) will not be larger than the actual optimal cost in the real world.

Branch-and-Bound Algorithm
Given the above dominance rules and lower bound, a branch-and-bound algorithm (named BB) is developed and shown in Algorithm 2. The exact algorithm recursively explores the solution space in a DFS manner. Every time we enter the recursive algorithm, we check whether the current partial sequence α of length l is dominated or the current lower bound is larger than the lowest cost found so far (Step 1). If neither holds, BB recursively calls itself (Steps 5-8). Since there are still n − l + 1 undetermined jobs in β, we make n − l + 1 new subsequences, each starting with a different leading job. Then, we repeatedly replace the original β and obtain n − l + 1 new schedules (Steps 5-7). At the end, BB calls itself n − l + 1 times (Step 8). Note that both cost* and π* are global variables. When the recursive algorithm ends, they hold the minimal cost and the globally optimal schedule, respectively.

Algorithm 2. The proposed branch-and-bound algorithm (BB(π,l)).
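The recursive structure described above can be illustrated with a simplified Python sketch. It is not Algorithm 2: it uses no dominance rules, it branches on every (job, developer) pair, and its only bound is the tardiness already incurred by the fixed jobs, which is valid because tardiness is never negative. The function name and data layout are hypothetical.

```python
from math import inf

def branch_and_bound(p, d, e, r):
    """Minimize total tardiness by DFS with pruning.

    p[j], d[j], e[j]: processing time, due date, and job type index of job j.
    r[a][x]: processing difficulty ratio of developer a for job type x.
    Returns (minimal cost, list of (job, developer) decisions).
    """
    n, m = len(p), len(r)
    best = {"cost": inf, "plan": None}   # plays the role of the global cost* and pi*

    def dfs(plan, free_at, cost, remaining):
        if cost >= best["cost"]:          # bound: fixed jobs already cost too much
            return
        if not remaining:                 # complete schedule with a strictly better cost
            best["cost"], best["plan"] = cost, list(plan)
            return
        for j in sorted(remaining):       # choose the next leading job
            for a in range(m):            # and the developer who processes it
                old = free_at[a]
                finish = old + p[j] * r[a][e[j]]
                free_at[a] = finish
                plan.append((j, a))
                dfs(plan, free_at, cost + max(0, finish - d[j]), remaining - {j})
                plan.pop()
                free_at[a] = old          # undo the decision before the next branch

    dfs([], [0] * m, 0, frozenset(range(n)))
    return best["cost"], best["plan"]

# Two identical developers with ratios 3 (type 1) and 1 (type 2), as in the anomaly example.
cost, plan = branch_and_bound(p=[4, 6], d=[8, 6], e=[0, 1], r=[[3, 1], [3, 1]])
print(cost, plan)  # 4 [(0, 0), (1, 1)]
```

Replacing the accumulated-tardiness bound with a tighter lower bound would prune more nodes, which is the role LB plays in BB.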
So far, we have proposed an exact algorithm named BB for locating the optimal solutions. With the aid of the dominance rules and lower bound, BB does not need to search the entire solution space. Dominated branches can be omitted, and the execution time is thereby reduced.

Experimental Results
In this section, we will observe the performance of the proposed branch-and-bound algorithm for n ≤ 18 and the efficiency of the proposed lower bound for n ≤ 12. Moreover, sensitivity tests are performed to show the influence of each control parameter. All the proposed algorithms are implemented in Pascal and executed on an Intel Core i7 @ 3.40 GHz with 8 GB RAM in a Windows 7 SP1 environment. For each setting, 50 random trials are conducted, and their execution times are measured in seconds. Finally, experimental results are discussed and compared.

Computational Results
We conduct experiments to observe the performance of BB and LB, and we show how the parameters (e.g., n) affect the objective costs of this problem. Table 3 lists all the parameters used in this section. Parameters m, n, $p_j$, and $d_j$ have already been defined in Section 3. To model different job types, we let $n_j$ be the number of jobs of type j for j = 1, 2, 3, where $n_1 + n_2 + n_3 = n$. To realize different processing difficulties, we use three kinds of developers in the following experiments. The first kind consists of average developers, i.e., $r_{ax} \in \{4, 5, 6, 7\}$. The second kind consists of uni-specialty experts who excel in only one arbitrary job type, i.e., $r_{ax} \in \{1, 2, 3\}$ for that type; for the other two job types, their processing difficulty ratios are in {4, 5, . . . , 10}. The third kind consists of bi-specialty experts who are highly proficient at two arbitrary job types, with processing difficulty ratios less than or equal to 3; for the remaining job type, their processing difficulty ratios are in {4, 5, . . . , 10}. Now we let $m_i$ be the number of developers of the ith kind for i = 1, 2, 3, where $m_1 + m_2 + m_3 = m$. Moreover, we let T be the total default processing time and use τ and R to control $p_j$ and $d_j$ such that they follow two discrete uniform distributions, i.e., $p_j \sim DU(1, 100)$ and $d_j \sim DU(T(1 - \tau - R/2)/m,\ T(1 - \tau + R/2)/m)$.

For clarity, the experiments are divided into three parts. In the first part, we observe the performance of BB. Table 4 shows the performance of BB when the problem size is small, i.e., n = 12. Note that all the other parameters are set to their default values. Clearly, the execution time increases if we add an extra developer, no matter what kind of developer he/she is. It implies that m also affects the execution time. On the other hand, τ influences the execution time more strongly than R. When all the jobs have earlier due dates, i.e., a large τ, each urgent job competes more intensively for limited resources, i.e., the m developers. Consequently, BB consumes more execution time.

Table 5 shows the performance of BB when we have a fixed number of developers, i.e., m = 3. For fixed numbers of developers and jobs, job type does not affect BB's performance. Even if there are only one job type and one developer type, BB spends the same execution time solving the problem. Unless all the m developers degenerate into the same developer type with the same processing difficulty, or the n jobs degenerate into the same job with the same processing time and due date, the problem does not become easy.

Table 6 shows the performance of BB when the problem size is medium, i.e., n = 15. At the beginning, we let the setting $m_1 = m_2 = m_3 = 1$ be a benchmark for later observations. The NA column gives the number of problem instances unsolved within a hundred million nodes. Again, the more developers we have, the more difficult the problem becomes. However, for all the settings with $m_1 + m_2 + m_3 = 4$, the results reveal that the problem instances having later due dates (τ = 0.25) are easier to solve. Most of them can be solved within a hundred million nodes. Table 7 shows the performance of BB when the problem size is large, i.e., n = 18. Again, for all the settings with $m_1 + m_2 + m_3 = 4$, the problem instances having later due dates are still easier to solve than the others. Compared with similar total tardiness minimization problems on identical machines, e.g., [76], the proposed branch-and-bound algorithm performs well for versatile developers. In [76], the maximum problem size that a branch-and-bound algorithm can solve is n = 20. Note that their machines are identical and unifunctional for processing the same kind of jobs. As discussed earlier, a permutation problem of m identical machines and n jobs is much easier than ours. The reason is that its solution space is just 1/(m!) of that of an m-heterogeneous-machine scheduling problem.

On the other hand, BB is also compared with a metaheuristic algorithm, i.e., GA [77]. The relative error percentage (REP) is defined as $(f_{GA} - f_{BB})/f_{BB} \times 100\%$, where f means an objective cost. In some situations, although GA takes only 0.02 s, it usually converges prematurely at local minima, and its objective costs might be 512 times larger than the optimal ones. It implies that an approximate algorithm cannot ensure solution quality even for n as small as 18. In light of the above comparisons, we learn that n = 18 is a proper problem size to observe the performance of a branch-and-bound algorithm for solving such a total tardiness minimization problem for versatile developers.

In the second part, we analyze the efficiency of the proposed lower bound. To show the performance of LB, we add an extra branch-and-bound algorithm without the aid of LB and compare it with the original BB. Table 8 shows the performances of the two branch-and-bound algorithms for n = 12. In general, the original BB takes only 13.86% of the execution time of the modified BB. It is clear that the proposed LB based on the harmonic mean can effectively prune unnecessary nodes and reduce execution time.

In the third part, three control parameters are adjusted to observe their influences on objective cost and execution time. A sensitivity test of $p_j$ and $d_j$ is shown in Figure 3. In this experiment, we set $m_1 = m_2 = m_3 = 1$ and $n_1 = n_2 = n_3 = 5$ to simulate an average case. Other parameters are set to their default values. Intuitively, objective cost decreases if each job's processing time is reduced. For example, a 15% decrease in each job's processing time can achieve a 50% decrease in objective cost, where −50% means cost reduction. On the other hand, objective cost decreases if we can postpone each job's due date. For example, a 15% increase in each job's due date leads to a 35% decrease in objective cost. In the real world, it is not easy to compress the processing time of each job. However, we can negotiate with our customers to postpone a job's due date. It is worthwhile to postpone the due date by 15% and achieve a 35% cost reduction.
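The instance-generation scheme described at the start of this section (three developer kinds, $p_j \sim DU(1, 100)$, and due dates controlled by τ and R) can be sketched in Python as follows. Several points are assumptions rather than details from the paper: T is taken to be the sum of the generated processing times, the due-date interval is rounded to integers, specialty ratios are drawn from {1, 2, 3}, and the values τ = 0.5 and R = 0.5 are placeholders because the defaults of Table 3 are not reproduced here. The function names are hypothetical.

```python
import random

def make_developer(kind, rng):
    """Return three ratios: kind 1 average, kind 2 uni-specialty, kind 3 bi-specialty."""
    if kind == 1:
        return [rng.choice([4, 5, 6, 7]) for _ in range(3)]
    ratios = [rng.randint(4, 10) for _ in range(3)]
    # Overwrite one (kind 2) or two (kind 3) randomly chosen job types with expert ratios.
    for x in rng.sample(range(3), kind - 1):
        ratios[x] = rng.randint(1, 3)
    return ratios

def make_jobs(n, m, tau, R, rng):
    """Draw processing times from DU(1, 100) and due dates from the tau/R interval."""
    p = [rng.randint(1, 100) for _ in range(n)]
    T = sum(p)  # assumption: total default processing time
    lo = max(0, round(T * (1 - tau - R / 2) / m))
    hi = max(lo, round(T * (1 - tau + R / 2) / m))
    d = [rng.randint(lo, hi) for _ in range(n)]
    return p, d, (lo, hi)

rng = random.Random(0)                       # fixed seed for repeatable trials
developers = [make_developer(kind, rng) for kind in (1, 2, 3)]
p, d, (lo, hi) = make_jobs(n=15, m=3, tau=0.5, R=0.5, rng=rng)
```

A sensitivity test like the one discussed above could then rescale every entry of `p` or `d` by a common factor (e.g., 0.85 or 1.15) before solving.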
In Table 9, we perform another sensitivity test on a limited resource, i.e., developers. Let $m_1 = m_2 = m_3 = 1$ be a benchmark setting. Though an add-on developer can be regarded as a creditable resource, it sharply increases the execution cost. From the viewpoint of run time, the number of developers (m) is also a kind of problem size and directly and adversely affects the performance of BB. However, from the viewpoint of objective cost, a bi-specialty developer can perform more jobs than a uni-specialty developer and reduce tardiness more. Clearly, such a versatile developer cannot be replaced by a traditional unifunctional machine. That is, these versatile developers make this model closer to the real world. Such findings distinguish our scheduling problem from traditional scheduling problems.

Figure 4 shows how $p_j$ and $d_j$ affect execution time. If we advance each job's due date (e.g., −15%), we work to tight schedules, and BB requires more execution time (e.g., 35.4%) to obtain the optimal solutions. If each job has a later due date (e.g., 15%), BB requires less execution time (e.g., −36.74%). This is because most jobs can be completed within their due dates. That is, BB can easily achieve zero or little tardiness, and less computing is needed. On the other hand, if the processing time of each job is lengthened, the durations of jobs are very likely to overlap with each other, and BB needs more execution time to schedule them. In general, the default processing time of a job is determined and fixed; however, its due date may be negotiable. It implies that bargaining for a later due date can simultaneously benefit the objective cost and the execution time.

Discussion
For traditional industries, a welder does not in general perform a spray job. Today, arranging for a single worker to perform different kinds of jobs has become fairly common among modern industries, e.g., games or movies. Developing a multimedia game heavily involves job scheduling, personnel management, time control, and cost reduction. Therefore, we present an interesting scheduling problem to deal with human resource management in the game industry. For example, tardiness is an important issue mainly caused by human factors. In general, for parallel machine scheduling, an acceptable problem size is about 25, e.g., [11,47]. Since machines are identical, no permutation of machines is needed. For versatile developers, such an optimization problem will become more difficult, and the problem size that can be optimally solved will be smaller. This is because we must take all the permutations of developers into account.
This study can be distinguished by the following five features. First, unifunctional machines are replaced by cross-domain developers, and this change makes the model more realistic. Second, such scheduling algorithms are cost-effective. Compared with enhancing computer hardware, job scheduling is a less expensive way to control budgets. Third, we propose a lower bound based on the harmonic mean that can prevent the anomaly from happening. Fourth, for some total tardiness minimization problems over heterogeneous machines, e.g., [9,43], the maximal solvable problem sizes for branch-and-bound algorithms are about 25. Note that such problems are easier, i.e., each machine always processes jobs at a fixed speed. On the other hand, the experiments show that the proposed branch-and-bound algorithm can optimally solve this problem for n jobs of different kinds and m heterogeneous developers, i.e., n = 18 and m = 4. In this study, we need to consider each developer's processing difficulties for different kinds of jobs. It implies that the presented problem is more difficult, and hence problem size 18 is a considerable achievement. Fifth, the optimal solutions obtained by the proposed algorithm can be used as fair benchmarks for evaluating other metaheuristic algorithms. Moreover, the algorithm can be applied to other industries that have similar needs for human resource management.

Conclusions
Today, a modern game is completed by multiple versatile developers, and its tardiness should be reduced as much as possible. Clearly, unifunctional machine scheduling is not suitable for this problem since developers can process jobs of different types. On the other hand, unrelated machine scheduling considers m × n processing speeds, which is too complicated to fit the presented problem. Consequently, we present an efficient branch-and-bound algorithm to optimally solve this problem.
In this study, to develop a branch-and-bound algorithm, we first analyze the properties of the problem and establish some mathematical properties for the branch-and-bound algorithm. Two main contributions are made in this study. First, this exact algorithm achieves optimality by coordinating each developer's multiple abilities. Second, a lower bound based on the harmonic mean is developed to avoid the anomaly. The experiments show that the proposed algorithm performs well for 18 jobs and 4 developers. That is, it can be employed as a benchmark for evaluating metaheuristic algorithms when problem sizes are less than or equal to 18.
The proposed exact algorithm is relatively efficient, but it still has limitations. Some future research directions are suggested as follows.

• A lower bound based on non-preemptive techniques is worth exploring in greater detail. This is because preemption might lead to underestimation of a lower bound.
• A high-quality metaheuristic algorithm is still needed. For a real-world instance, BB might take several hours to generate the optimal schedules. In the near future, we can develop some approximate algorithms to solve large problem instances near optimally, e.g., n = 100.
• Hybridization might improve efficiency. If a high-quality metaheuristic algorithm is developed, an exact algorithm can start searching from a near-optimal solution obtained by the metaheuristic algorithm. That would help improve execution speed. With such a hybrid exact algorithm, we can evaluate other approximate algorithms objectively and precisely.

Conflicts of Interest:
The authors declare no conflict of interest.
Appendix A

Lemma A1. For a schedule $\pi$, if there exists a developer a whose maximum completion time is larger than $\left(\sum_{j=1}^{n}\max_{i=1}^{m} p_j r_{ie_j}\right)/m + \max_{i=1}^{m}\max_{j=1}^{n} p_j r_{ie_j}$, there exists another developer b whose maximum completion time is less than $\left(\sum_{j=1}^{n}\max_{i=1}^{m} p_j r_{ie_j}\right)/m$.

Proof. We prove this property by contradiction. Suppose the remaining m − 1 developers have maximum completion times that are all larger than or equal to $\left(\sum_{j=1}^{n}\max_{i=1}^{m} p_j r_{ie_j}\right)/m$. Then, we sum up all the maximum completion times of the m developers. That is,

$$\sum_{i=1}^{m}\max C_{j@i}(\pi) > \frac{1}{m}\sum_{j=1}^{n}\max_{i=1}^{m} p_j r_{ie_j} + \max_{i=1}^{m}\max_{j=1}^{n} p_j r_{ie_j} + \frac{m-1}{m}\sum_{j=1}^{n}\max_{i=1}^{m} p_j r_{ie_j}.$$

On the other hand, consider the worst situation, in which each job j is assigned to its worst matched developer; i.e., each job j consumes the maximum processing time $\max_{i=1}^{m} p_j r_{ie_j}$. Consequently, the sum of all the maximum completion times of all the m developers in schedule $\pi$ is less than or equal to the total worst processing times of all the n jobs. That is,

$$\sum_{i=1}^{m}\max C_{j@i}(\pi) \le \sum_{j=1}^{n}\max_{i=1}^{m} p_j r_{ie_j}.$$

Now we have

$$\frac{1}{m}\sum_{j=1}^{n}\max_{i=1}^{m} p_j r_{ie_j} + \max_{i=1}^{m}\max_{j=1}^{n} p_j r_{ie_j} + \frac{m-1}{m}\sum_{j=1}^{n}\max_{i=1}^{m} p_j r_{ie_j} < \sum_{i=1}^{m}\max C_{j@i}(\pi) \le \sum_{j=1}^{n}\max_{i=1}^{m} p_j r_{ie_j}.$$

That is,

$$\frac{1}{m}\sum_{j=1}^{n}\max_{i=1}^{m} p_j r_{ie_j} + \max_{i=1}^{m}\max_{j=1}^{n} p_j r_{ie_j} + \frac{m-1}{m}\sum_{j=1}^{n}\max_{i=1}^{m} p_j r_{ie_j} < \sum_{j=1}^{n}\max_{i=1}^{m} p_j r_{ie_j}.$$

Multiplying both sides by m, it implies that

$$\sum_{j=1}^{n}\max_{i=1}^{m} p_j r_{ie_j} + m\max_{i=1}^{m}\max_{j=1}^{n} p_j r_{ie_j} + (m-1)\sum_{j=1}^{n}\max_{i=1}^{m} p_j r_{ie_j} < m\sum_{j=1}^{n}\max_{i=1}^{m} p_j r_{ie_j},$$

i.e., $m\max_{i=1}^{m}\max_{j=1}^{n} p_j r_{ie_j} < 0$.
It is a contradiction. The proof is complete.

Rule A1.
For an optimal schedule $\pi^*$, each developer's maximum completion time is less than or equal to $\left(\sum_{j=1}^{n}\max_{i=1}^{m} p_j r_{ie_j}\right)/m + \max_{i=1}^{m}\max_{j=1}^{n} p_j r_{ie_j}$.
Proof. We prove it by contradiction. Let developer a be a developer in an optimal schedule $\pi^*$ whose maximum completion time $C_{j'@a}(\pi^*)$ is larger than $\left(\sum_{j=1}^{n}\max_{i=1}^{m} p_j r_{ie_j}\right)/m + \max_{i=1}^{m}\max_{j=1}^{n} p_j r_{ie_j}$, where job $j'$ is the last job assigned to developer a in this optimal schedule $\pi^*$. By Lemma 1, there exists another developer b whose maximum completion time is less than $\left(\sum_{j=1}^{n}\max_{i=1}^{m} p_j r_{ie_j}\right)/m$. Now we check whether the gap between the maximum completion time of developer a and that of developer b can accommodate job $j'$, so that it achieves an earlier completion time (i.e., less tardiness). Let $C_{j'@a}(\pi^*)$ be $\left(\sum_{j=1}^{n}\max_{i=1}^{m} p_j r_{ie_j}\right)/m + \max_{i=1}^{m}\max_{j=1}^{n} p_j r_{ie_j} + \varepsilon$, where $\varepsilon > 0$. Then, we have

$$\left(\frac{1}{m}\sum_{j=1}^{n}\max_{i=1}^{m} p_j r_{ie_j} + \max_{i=1}^{m}\max_{j=1}^{n} p_j r_{ie_j} + \varepsilon\right) - \left(\frac{1}{m}\sum_{j=1}^{n}\max_{i=1}^{m} p_j r_{ie_j} + p_{j'} r_{be_{j'}}\right) = \max_{i=1}^{m}\max_{j=1}^{n} p_j r_{ie_j} + \varepsilon - p_{j'} r_{be_{j'}} \ge \varepsilon > 0.$$

That is, we can move the last job $j'$ from developer a to developer b and achieve an earlier completion time, i.e., less tardiness for job $j'$. It contradicts the assumption that $\pi^*$ is an optimal schedule. The proof is complete.
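The cap in Rule A1 is cheap to evaluate, as the following Python sketch shows. The function name `completion_time_cap` and the sample data are hypothetical; the data reuse the two-job, two-developer example with ratios 3 and 1 from the lower-bound discussion.

```python
def completion_time_cap(p, e, r):
    """Upper limit on any developer's completion time in an optimal schedule (Rule A1).

    p[j], e[j]: processing time and job type index of job j.
    r[i][x]: processing difficulty ratio of developer i for job type x.
    """
    m = len(r)
    # Worst-case actual processing time of each job over all developers.
    worst = [max(p[j] * r[i][e[j]] for i in range(m)) for j in range(len(p))]
    return sum(worst) / m + max(worst)

cap = completion_time_cap(p=[4, 6], e=[0, 1], r=[[3, 1], [3, 1]])
print(cap)  # 21.0
```

Any partial schedule in which some developer finishes later than this value can be discarded without losing an optimal solution.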
Lemma A2. For a given job type x, the sum of the last k real developers' throughputs (i.e., $\sum_{a=m-k+1}^{m} 1/r_{ax}$) is equal to the total throughput of the k virtual developers (i.e., $k/{}^{k}\bar{r}_x$, where ${}^{k}\bar{r}_x$ is the per-developer ratio of Definition 1).
Proof. Since the processing difficulty ratio of developer m − k + 1 is $r_{m-k+1,x}$, he/she takes $r_{m-k+1,x}$ days to process a unit job (e.g., $p_j = 1$) of type x. That is, for job type x, his/her daily amount of work is $1/r_{m-k+1,x}$. Similarly, for each developer a, his/her daily amount of work is $1/r_{ax}$ for a = m − k + 2, m − k + 3, . . . , m. Then, for the k real developers, their daily amount of work is $\sum_{a=m-k+1}^{m} 1/r_{ax}$. On the other hand, since the processing difficulty ratio of each virtual developer is ${}^{k}\bar{r}_x$, his/her daily amount of work is $1/{}^{k}\bar{r}_x$. Thus, their total daily amount of work is $k/{}^{k}\bar{r}_x$. We prove the property by showing $k/{}^{k}\bar{r}_x = \sum_{a=m-k+1}^{m} 1/r_{ax}$. We have

$$k/{}^{k}\bar{r}_x = k\left(\frac{k}{1/r_{m-k+1,x} + 1/r_{m-k+2,x} + \dots + 1/r_{mx}}\right)^{-1} = 1/r_{m-k+1,x} + 1/r_{m-k+2,x} + \dots + 1/r_{mx} = \sum_{a=m-k+1}^{m} 1/r_{ax}.$$
The proof is complete.
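Lemma A3. For a given job type x, the sum of the last k real developers' throughputs (i.e., $\sum_{a=m-k+1}^{m} 1/r_{ax}$) is equal to the substitute's throughput (i.e., $1/{}^{k}\hat{r}_x$, where ${}^{k}\hat{r}_x$ is the substitute's ratio of Definition 2).
Proof. Since the processing difficulty ratio of the substitute is ${}^{k}\hat{r}_x$, his/her daily amount of work is $1/{}^{k}\hat{r}_x$. By Definition 2, the substitute's ratio is the virtual developer's ratio ${}^{k}\bar{r}_x$ of Definition 1 divided by k. Then, we have $1/{}^{k}\hat{r}_x = 1/({}^{k}\bar{r}_x/k) = k/{}^{k}\bar{r}_x$, which equals $\sum_{a=m-k+1}^{m} 1/r_{ax}$ by Lemma 2.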
Lemma A4. The throughput of the last k real developers is less than or equal to $1/{}^{k}\hat{r}_{\min}$, where ${}^{k}\hat{r}_{\min}$ is the minimal substitute ratio of Definition 3.
Proof. For the same type of jobs, by Lemma 3, the throughputs of the k real developers and the substitute are the same. That is, if all the remaining jobs belong to job type 1, the throughput of the substitute is $1/{}^{k}\hat{r}_1$. Similarly, the throughput of the substitute is $1/{}^{k}\hat{r}_2$ if all the jobs belong to job type 2, and his/her throughput is $1/{}^{k}\hat{r}_3$ if all the jobs belong to job type 3. Clearly, the maximal throughput is $1/{}^{k}\hat{r}_{\min}$ and the minimal throughput is $1/{}^{k}\hat{r}_{\max}$. In the real world, however, it is rare that all the jobs belong to the same job type. Consequently, the throughput of the substitute is in $[1/{}^{k}\hat{r}_{\max}, 1/{}^{k}\hat{r}_{\min}]$ if each of the jobs is of a different type. Namely, the throughput of the last k developers in the real world can be at most $1/{}^{k}\hat{r}_{\min}$. The proof is complete.
Proof. We prove it by contradiction and suppose $f(\pi^*) < LB(\pi)$. Let the average processing difficulty ratio for the optimal schedule $\pi^*$ be $r^*$, where ${}^{m}\hat{r}_{\min} \le r^* \le {}^{m}\hat{r}_{\max}$. Since the optimal objective cost is lower than that of the proposed lower bound, the actually optimal throughput must be larger than the throughput of the substitute. That is, we have $1/r^* > 1/{}^{m}\hat{r}_{\min}$. On the other hand, note that ${}^{m}\hat{r}_{\min} = \min\{{}^{m}\hat{r}_1, {}^{m}\hat{r}_2, {}^{m}\hat{r}_3\}$. Then, by Lemma 4, we have $1/{}^{m}\hat{r}_{\min} = 1/\min\{{}^{m}\hat{r}_1, {}^{m}\hat{r}_2, {}^{m}\hat{r}_3\} \ge 1/r^*$ (since ${}^{m}\hat{r}_{\min} \le r^* \le {}^{m}\hat{r}_{\max}$).
It contradicts $1/r^* > 1/{}^{m}\hat{r}_{\min}$. The proof is complete.