Genetic Optimization of Energy- and Failure-Aware Continuous Production Scheduling in Pasta Manufacturing

Energy and failure are separately managed in scheduling problems despite the commonalities between these optimization problems. In this paper, an energy- and failure-aware continuous production scheduling problem (EFACPS) at the unit process level is investigated, starting from the construction of a centralized combinatorial optimization model combining energy saving and failure reduction. Traditional deterministic scheduling methods are difficult to rapidly acquire an optimal or near-optimal schedule in the face of frequent machine failures. An improved genetic algorithm (IGA) using a customized microbial genetic evolution strategy is proposed to solve the EFACPS problem. The IGA is integrated with three features: Memory search, problem-based randomization, and result evaluation. Based on real production cases from Soubry N.V., a large pasta manufacturer in Belgium, Monte Carlo simulations (MCS) are carried out to compare the performance of IGA with a conventional genetic algorithm (CGA) and a baseline random choice algorithm (RCA). Simulation results demonstrate a good performance of IGA and the feasibility to apply it to EFACPS problems. Large-scale experiments are further conducted to validate the effectiveness of IGA.


Introduction
Energy modeling for fabrication processes and energy-aware production scheduling are fundamental issues for energy management in manufacturing systems. The former is in the scope of energy efficiency (EE) [1], investigating opportunities for energy saving in a production process. The latter is in the field of demand response (DR) [2], taking into account volatile energy prices and the power profiles of machines. In literature, energy consumption is regularly tackled as one objective of scheduling, engaged in the optimization model with other objectives, including makespan [3], reliability [4], tardiness [5], or labor cost [6]. Especially in continuous manufacturing systems, such as steel-making [7] and textile dyeing [8], although these processes are extremely energy-sensitive, energy saving has a limited contribution to the optimization compared to other objectives.
Certainly, energy management cannot be ignored for energy-sensitive continuous production scheduling. Among other objectives, reliability is one of the most difficult to achieve. A provided schedule is often faced with uncertainties in actual production, preventing it from being executed as planned, where failure uncertainty is considered the most significant uncertainty in production scheduling [9]. However, uncertainties are usually not taken into consideration during the execution of schedules [10], since the reliability of the environment is hard to measure. Stochastic modeling Gong et al. [15] formulated a mixed-integer linear programming (MILP) mathematical model for energy-aware production scheduling on a single machine, where a finite state machine (FSM) was used for energy modeling of different machine states. A conventional genetic algorithm was implemented to give an energy-efficient solution even at the presence of stochasticity, verified by empirical data from a grinding machine. For 35 jobs in 7 days, the GA took 2593 s to search for a near-optimal schedule. The performance of the algorithm was intended to improve. Jaclason et al. [16] presented a multiobjective nonlinear programming model solved with the nondominated sorted genetic algorithm (NSGA-II), aiming to schedule the use of home appliances based on the price of electricity in real-time (RTP). Statistical analysis indicated that NSGA-II has a better performance than a random GA. Details of the adaptation of the algorithm to the actual problem were not mentioned. István et al. [17] investigated the design of robust production scheduling proactively guaranteeing the energy consumption limit with uncertainty scenarios. Two exact (branch-and-bound and logic-based benders decomposition) and one heuristic algorithm (tabu search) were used to find an optimal permutation of given operations. Afterwards, a pseudo-polynomial algorithm was proposed to find the optimal robust schedule for that permutation. The master problem as a MILP model was solved by an existing optimizer, Gurobi. Guo et al. [5] addressed a flow shop scheduling problem minimizing energy consumption and tardiness penalties with the constraints of machine-state-dependent setup time. Fuzzy numbers were used to determine the impact of uncertainty. A GA with better performance than random GA was introduced, but the genetic operations were problem-specified and hard to adapt to other scenarios. Guillaume et al. [3] studied energy-aware scheduling of task execution on multiprocessors. The NP-hardness of the problem was proven; afterwards, possible solutions were discussed. Multiple heuristics were examined to get a near-optimal schedule, where the best heuristic was defined as the one with the minimum value of the objective function. If the scenario changes, the proposed heuristic might not remain efficient.
Research on energy-aware scheduling has provided multiple modeling techniques and algorithms for relatively efficient solutions. Further reviews of failure-related scheduling attempts to discover commonalities in problem modeling and algorithm design are needed.

Failure-Related Scheduling
Stochastic modeling is used in many studies by assuming or estimating a probability distribution of uncertainty during production periods [18]. These methods highly depend on historical records to fit the distribution. Robust scheduling does not have these kinds of limits, since no probability distribution of uncertainty is needed [19]. Instead of modeling uncertainties using stochastic approaches, it is designed to "absorb" uncertainty [20].
Li et al. [21] investigated a workload scheduling problem in cloud data centers. Reliability indexes were defined and labeled using worst-fit and best-fit strategies; afterwards, algorithms were designed to make the best use of the most reliable and powerful servers in data centers. The proposed method benefited from logs of data centers, while in other scenarios, detailed records of failure are limited. Jiang et al. [22] studied production scheduling with uncertain processing times for the steel-making continuous casing problem. An estimation distribution algorithm (EDA) using variate processing time as the uncertainty factor was proposed, also firmly driven by records of processing times. Wang et al. [23] worked on production scheduling of precast components in the face of uncertain workflow. The discrete event simulation (DES) was used to evaluate the feasible options using genetic algorithms. Uncertain processing time and complex resource constraints were taken into account in simulations. A trade-off was achieved between on-time delivery and minimum production cost. The proposed approach was implemented in the simulation software ARENA on a commercial optimization engine. Lu et al. [24] assumed a machine breakdown following the Weibull failure function in a maintenance planning production scheduling problem. A bi-objective genetic algorithm was used for saving energy and meeting deadlines, with preventive maintenance integrated into production orders. However, Weibull distribution is not always an ideal model for failures in other scenarios. Guo et al. [25] dealt with a multiobjective production scheduling problem at the factory level, where uncertainties were described as continuous or discrete random variables. Factors including completion time and start time of production processes were derived using probability theory. A linear model based on the concept of satisfactory level was used to represent the impact of uncertainty. Stochastic optimization with such a basic model is not sufficient for complex systems. Drwal et al. [26] provided a polynomial time optimization algorithm for the problem of scheduling jobs with uncertain completion due-dates on a single machine. Jobs were supposed to have equal weights and uncertain due-date intervals were randomly generated. For more general problems with different weights of jobs, the research conjectured the problem to be NP-hard and presented heuristics algorithms as possible approaches. Shown in experimental results, computational complexity increased significantly in some cases. Ghezail et al. [27] introduced graphical representations of scheduling solutions to model the robustness of the schedule and the impacts of unexpected disruptions. This method avoided restrictive quantitative approaches used for robust scheduling, based on an assumption that the initial order of jobs was always respected.
In literature, there were many differences between energy and failure modeling. Simulation methods were frequently used to evaluate customized representations of failure uncertainty. Similar to energy-aware scheduling, failure-related scheduling problems were regularly deduced as NP-hard, where heuristics were habitually applied.
To sum up, the scheduling problem has always been formulated in the abstract as 'find from a Set S of candidate schedules a subset T that satisfies some criterion C and minimizes an objective function f '. Heuristic methods are commonly applied to solve this combinatorial optimization problem [28].

Problem Formulation
The EFACPS problem studied in this paper was formulated by investigating three subproblems, including ordinary production scheduling, energy-aware scheduling, and failure-aware scheduling. Afterwards, a centralized combinatorial optimization model is proposed.

Notation
The description of parameters used in this section is shown in Table 1. Table 1. Parameters used for problem formulation.

Parameter Description
J set of waiting jobs with ID {1, 2, . . . , N} j i job with index i N number of waiting jobs P set of product types M number of possible product types p i product type of j i q i objective quantity of j i v p unit production speed of product type p, where p ∈ P v p i unit production speed of j i , whose product type is p i t i production duration of j i T st i start time of j i T ed i end time of j i S i set of processing stages of j i T S i set of durations of stages of j i t S ki duration of kth stage of j i P s power consumption of stage s E i energy consumption of processing j i C E i energy cost of processing j i D i vector of energy price vector of failure rate after maintenance b k failure rate of kth hour after maintenance R i vector of failure rate after raw material charged on the machine P Fi probability of failure occurrence during production of j i u p i unit raw material cost of product type p i C F i failure cost of j i

Production Scheduling
Continuous production scheduling on a single machine is defined with the following constraints: • The machine is kept busy if there are remaining jobs to finish.

•
The machine has a product-type-dependent production speed.

•
The process of a job starts when it is charged on the machine and ends when it is unloaded.

•
A new job is charged to the machine immediately after the end of the previous job.

•
Each job has a specific product type and a target quantity.

•
Preemption is not allowed, since changeover of jobs causes excessive waste.
The scheduling model is constructed from the fundamental objects of the problem. Waiting jobs are denoted using set J = {j 1 , j 2 , . . . , j i , . . . , j N }, where N is the total number. A job j i has two attributes: Product type p i and objective quantity q i . P = {p 1 , p 2 , . . . , p M } is the set of possible product types. The production speed of a product with type p i is defined as v p i , where p i ∈ P. Before planning a schedule, all aforementioned parameters (J, P, and p i , v p i , q i of each job j i ) are available and remain constant during production. Such a problem is classified as an offline scheduling problem [29].
The production duration of job j i is denoted as t i and calculated using Equation (1). The start timestamp and end timestamp of j i are defined using T st i and T ed i , respectively.
The following time constraints make sure that preemption is prohibited and jobs are executed respecting the scheduled sequence:

Energy-Aware Scheduling
Either energy or failure management in actual production is extremely complicated due to machine and product characteristics [9]. The energy charging policy used in this study is real-time pricing (RTP) with hourly changed electricity price [6].
A job j i is processed on the machine with four stages: Preparing, Preprocessing, Working and Discharging, denoted as S = {s 1 , s 2 , s 3 , s 4 }. The durations of each stage of job j i are represented using the set T s i = {t s 1i , t s 2i , t s 3i , t s 4i } with a constraint in Equation (4). The machine energy consumption E i during the production of j i is calculated using Equation (5), where P s is the power consumption of stage s. ∀t ∈ T s i , t > 0, i ∈ [1, 2, ..., N] (4) With RTP, the energy cost C E i to process the job j i can be derived. First, we designed a column vector D i of size N D i to restore useful price information during the production period. Afterwards, E i was discretized into a line vector of size N D i . Each element e ij represents energy consumption in the time period between T i and T j . This process is illustrated in Figure 1 and Equations (6)- (9). In Figure 1, the head of j i has the energy cost of d T st i , starting from T st i , ending at T st i + 1. The tail of j i has the energy cost of d T ed i , starting from T ed i , ending at T ed i . The body of j i has costs [d T st i +1 , . . . , d T ed i −1 ], starting from T st i + 1, ending at T ed i .

Failure-Aware Scheduling
The taxonomy of failure awareness in this paper was inspired by Reference [30], which classifies maintenances by their influences on health states of the machine. The set H = {h 1 , h 2 , . . . , h F } describes machine health states in ascending order of failure rate: h 0 is the initial health state and h F is the failure state. There are two kinds of possible maintenance actions in the system. M Rr represents real-time response maintenances performed by workshop operators. These actions are carried out during the processing of a job j i , making the machine quickly recover from the blockage but slowly fall into a worse health state. M Re represents replacement maintenance actions, such as replacing vulnerable parts or repairing machine components, after which the machine health state will be restored to a better health state. M Re are immediately performed after machine failure or are planned on fixed days.
Before modeling the failure uncertainty, background information of the EFACPS problem needs to be clarified: (1) Raw materials of a job will be lost if machine failure happens after the job charged on the machine; (2) maintenances are organized on fixed days on the production line; (3) compared to the cost of wasted raw materials, the cost of maintenances is negligible.
A schedule π is denoted as π = {W 1 , M 1 , W 2 , M 2 , . . . , M L , W L+1 }, where M i is a maintenance action of type M Re on a fixed date, W i is a batch of jobs between two maintenances, and L is the number of maintenances. If a maintenance of type M Re is planned to the time when the machine is processing job j i , it is considered to finish before T ed i , without postponing the next job. The duration of M Re is neglected under the aforementioned assumptions. M Rr is indirectly used for the problem to distinguish maintenance types.
From the sequence of maintenance actions, the probability of machine failure at a given time is derived. Failure rate r(t) is considered a random variable changing over time t, influenced by the effect of maintenance and wear in the machine, satisfying the memoryless property that it is based solely on adjacent maintenances [31]. r(t) is presented in Equation (10), where T pm i and T pm j are the time for two adjacent maintenances M i and M j . f 1 (t) is the average failure rate of time t after maintenance, whereas f 2 (t) is the average failure rate of time t before maintenance.
Suppose there are k hours between two adjacent maintenances. A vector I of size k is provided to estimate the influence of a maintenance, shown in Equation (11), where b j represents machine failure rate of the jth hour after the first maintenance.
The aforementioned failure model reduces both energy and failure modeling to a similar optimization model. Knowing the hourly dependent failure rate, the failure cost (wasted raw material cost) C F i to process a job j i can be estimated using a close way of calculating C E i . A vector R i of size N R i is defined in Equation (12) to represent the failure rate of each hour after raw material is charged on the machine. The calculation of B and retrieval of R i from B is explained in Section 5.1. If machine failure occurs in the stage Preparing, no failure cost is charged before the stage Preprocessing starts.
The probability of machine failure occurrence during the production of j i is defined as P Fi , calculated using Equation (13), where r i is the ith element of R i . C F i is calculated using Equation (14), where u p i is the unit raw material cost of j i with product type p i .

Optimization Model
The objective of EFACPS is to minimize the overall cost, including failure cost and energy cost. The multiobjective function is presented in Equation (15) where ϕ q is an objective and ω q is the corresponding weight. The scheduler could set ω q to 0 in case the related objective ϕ q needs to be ignored.
We define the optimization model for the problem in Equation (16), subject to Equations (1)- (14).

Method Description
The proposed model has the NP-hardness property according to the following inference: Given a candidate schedule π, no polynomial time verification algorithm is found to accept or reject π as a solution. For the problem of size N, its solution domain has the size of N!, since any random permutation of a waiting job sequence leads to a reachable cost. Although the overall cost of π can be calculated in polynomial time using the method introduced in this paper, we could not determine its position in the solution domain. Therefore, the model is not in the class NP and is inherently NP-hard.
A genetic algorithm (GA) is suitable for searching for a solution of an NP-hard problem because of its adaptive global optimization ability [32]. Conventional GA adopts artificial evolution to a population of individuals. Properties are inherited from parents but also altered and mutated. Individuals are selected for a new generation by the desired goal. The provided IGA in this paper extends CGA with the following features, which accelerate the convergence speed and provide an easier ascent towards a global optimum: (1) Memory [33] is introduced to prevent duplicate searches; (2) a customized revolution strategy presented in the microbial genetic algorithm (MGA) [14] is adopted; (3) problem-related random techniques are applied to prevent the algorithm being trapped in local optima; (4) new individuals are evaluated to ensure a significant difference from their parents.
The procedure of the algorithm is presented in Figure 2 and discussed below.

Initialization
Initial values are starting points in the search space and are highly engaged in influencing the performance of GA. These values also need to meet the constraints of the model. The population size n g is another important problem-based tuning parameter, since there is a threshold between the requirement of computational resources (time and space) in each iteration and the global convergence speed of the algorithm. For EFACPS of N waiting jobs, an individual is a candidate schedule encoded with indexes of jobs. A randomly generated vector I of size N is provided to represent an individual, whose elements are indexes of jobs, implying their execution order.
Randomness tests and resampling techniques could also be performed to ensure the randomness of initial values, which is not the topic of this paper.

Elitism Selection
The CGA stochastically selects more fitted individuals from the current generation. Afterwards, genetic operations (crossover and mutation) are evaluated on these individuals to obtain a new generation. The fitness value is used as a condition for individual selection. The elitism of the current generation is altered by genetic operations.
For EFACPS, the fitness value of an individual is the overall cost of such a candidate schedule, calculated using Equation (16). Since elitism is not reserved in each iteration, the application of CGA (see Algorithm 1) to the problem can lead to no converged evolutions in some iteration. One possible solution is using a buffer to trace the elitism of each iteration as a candidate for global optima. This approach is also used in the RCA (see Algorithm 2). Compared to the provided IGA (see Algorithm 3), such an operation keeps the global convergence but cannot guarantee the local convergence in each iteration. The selection strategy used in the IGA is based on the notation of winner and loser. Two individuals are randomly chosen from the current generation and sorted by their fitness values, where the winner has a smaller total cost. Afterwards, the winner is chosen as the elitism and kept identical in the next generation, while the loser is modified with the following genetic operations. This process is repeated several times, and each time two children are generated: one is the elitism (same as the winner) and the other is the altered loser.

Genetic Operators
The crossover and mutation processes are explained in Figure 3. With a randomly generated mask, the two selected chromosomes (winner and loser) generate a new child. Afterwards, a swap is performed on two randomly chosen points of the generated child.  The hamming distance between the loser and the generated child is calculated, which is the number of positions at which the corresponding symbols are different [34]. The aforementioned genetic operations will re-execute if the distance is small, ensuring considerable difference between the loser and the child. The algorithm also examines the existence of the generated child in the memory space. If the answer is positive, the child will be dropped and generated again to avoid a repeated search. At the end of each iteration, individuals of the current generation, represented using vectors, are saved to a memory space of fixed size. These items are inserted and removed according to the first-in, first-out (FIFO) principle.

Case Study
Pasta is the 331st most traded and the 1069th most complex merchandise according to the product complexity index (PCI) [35]. In 2014, 14.3 million tons of pasta were produced and traded worldwide, worth 8.4 billion US dollars [36]. Pasta production is both energy-sensitive and quality-focused [37], including four major stages: (1) In the mixing stage, wheat semolina or flour is mixed with water and optional ingredients (egg, salt, vegetable juice, etc.). The mixture is then put through a vacuum dough mixer to produce a homogeneous mass without air bubbles; (2) in the extrusion stage, the extruder pushes the dough through dies to form the shape. Blades of trimmers cut the dough in desired length; (3) in the drying stage, pasta is dried to a desired moisture, giving it a firm shape and a long storage life; (4) in the packaging stage, pasta is packaged into bags or boxes which are ready to transport.
Disruptions in the aforementioned stages cost huge loss of semifinished products. For instance, blockage in the extrusion stage for a relatively long time makes the dough hard to shape or even spoiled. The studied cases in this paper are derived from Soubry N.V., a large pasta manufacturer in Belgium. With the constantly upgrading IoT (Internet of Things) systems, such disturbances in the continuous production process can be better measured and supervised. Empirical data concerning energy consumption and failure measurement are provided by intervention of wireless sensors.

Parameter Resolution
This section presents how to retrieve parameters of the EFACPS model from empirical data saved by the manufacturing execution system (MES) used in the company. MES provides information that helps decision makers to understand the current situation of the shop floor [38]. After preprocessing for intended uses in production scheduling, the following records were examined:

•
Order information records (OIR) contain general attributes of orders processed in the job shop, including planning time, objective quantity, product type, and raw material cost.

•
Order production records (OPR) contain processing details of orders, including effective time and production speed.

•
Machine state records (MSR) contain runtime and downtime periods of the machine in the processing window of each order.

•
Maintenance event records (MER) contain dates of maintenance events.
In addition, RTP data were used as the energy price in the experiment, taken from Belpex, the electricity spot market in Belgium [39]. Most of the parameters used in the problem formulation can be found in the aforementioned records, see Table 2. Other parameters which cannot be directly retrieved in those records were divided into two groups: Parameters (S i , S ik , T S i , t S ik , P s ) about processing stages and parameters (B, b k , R i ) about machine failure rates. The rest of the parameters (E i , C E i , P Fi , C F i ) are objective variables. Table 2. Parameters in data records.

Record Parameters
MES has no records of stage information, making it difficult to know stage-related parameters. In the case study, according to the experience of job shop operators, we made an assumption that the average duration of the preparing stage t S i1 is 1 h, after which raw materials are charged to the machine. The energy consumption profile of the investigated machine is estimated by machine power parameters and by consumed energy measured by a power meter. Each type of product has a specific power profile. Until now, all required parameters for Equations (6)-(9) to calculate energy cost C E i are available.
The function of the failure rate between two adjacent maintenances r(t) is calculated using the average effect of maintenances, where the influences of maintenances f (t) are accumulated and normalized. B is a discretized representation of r(t). The procedure of retrieving R i is explained as follows: In Figure 4, r(t) is the function of failure rate over time, b k = r(T pm + k). Knowing T st i , T ed i , t S i1 (by our assumption t S i1 = 1) and the time of the previous maintenance T pm , R i can be detected.
With all the required parameters known from the abovementioned procedure, Equations (12)-(14) are used for calculating failure cost. For the objective function in Equation (16), both ω 1 and ω 2 are set to 1.

Simulation Settings
Empirical MES records are also used as test cases of EFACPS. Before performing the proposed IGA, various settings of the experiment were made: 1. The concerned waiting job set in the experiment is a subset of all finished jobs in MES records.
Derived from a fixed date range (from 2016-11-03 06:00:00 to 2016-11-07 17:00:00), 8 jobs were investigated in the case study. Details are provided in Table 3. 2. Corresponding Belpex RTP data have the same date range as the concerned waiting jobs. 3. The time step in the schedule is one second. 4. Maintenances are fixed on Saturdays. 5. Randomly generated data are used instead of real production data for some job characteristics (raw material cost and power profile) according to the confidentiality agreement. In Table 3, job j i with product type p i has unit raw material cost u p i following the distribution U(0.05, 0.10) B C/kg (from data source [35,36]) and power profile g p i following the distribution U(0.08, 0.15) MW (from data source [40]).
We used the procedures presented in Section 5.1 to calculate the average effect of maintenance f (t). The result in reality is shown in Figure 5, based on 90 maintenance records during 1 year. Figure 5. Failure rate as a function of time before and after maintenance (0d = maintenance day, −x d = x days before maintenance, +xd = x days after maintenance).
In Figure 5, the failure rate f (t) consists of technical failures and small breakdowns solved by operators (M Rr ). Replacement maintenance (M Re ) is performed on day 0. As mentioned in Section 3.1.3, M Rr can not prevent the machine slowly falling into a worse health state; instead, M Re can restore the machine to a better health state. Therefore f (t) slowly increases during the days before maintenance (negative days) and sharply decreases on the day after maintenance (1 d), then keeps growing in the following days (other positive days).
With the average effect of maintenance f (t), we can calculate the failure rate of each day. For instance, 2016-11-04 is Thursday, which is 5 days after and 2 days before a maintenance day (Saturday); therefore, its failure rate r = max[ f 1 (5), f 2 (2)].
Parameters for IGA are tuned as follows: The population size α is set as 8; each generation has 8 candidate solutions. The maximal generation size n 2 is 200; therefore, the IGA will iterate the evolution for 200 generations. In each iteration, two chromosomes are selected and sorted by their fitness. The winner is retained to the next generation as the elitism, the loser is modified by genetic operations. The crossover rate (p 1 = 60%) and mutation rate (p 2 = 80%) are fixed. These high rates indicate that search points spread out among the solution space [41].

Results and Discussions
The near-optimal schedule found by IGA was compared with the original schedule and other candidate schedules: The single objective schedule and shortest job first (SJF) schedule. Figure 6 visualizes the original schedule and the candidate schedules with energy price and failure rate in the selected date range. Schedules are presented in continuous bar plots with different colors. Real-time electricity price changes hourly but remains the same within one hour, marked in blue. The failure rate is derived from previous sections, marked in green. The detailed comparison results of investigated schedules are provided in Table 4.
All candidate schedules start at 2016-11-03 09:30:51, which is the start time of job 510 in the original schedule C1. The near-optimal schedule C2 found by the IGA decreases both the energy cost and failure cost, saving 23.24% of total cost. The near-optimal schedule is further compared with other schedules using different policies. The candidate schedule C3 uses a classical policy, shortest job first (SJF), increasing 4.66% of the total cost than C1. Schedules considering a single objective are also inspected: The candidate schedule C4 minimizes the energy cost, saving 5.09% of energy cost and 5.77% of total cost. The candidate schedule C5 minimizes failure cost, saving 24.90% of failure cost and 23.21% of total cost. Among all candidate schedules, C2 has the lowest total cost.
A benefit of our proposed model is the flexibility to apply the schedules in actual production. Occurrences of short spare time (<1 h) on the machine do not conflict with our assumption that the machine is kept busy when there are remaining jobs. A slight left or right shift of a job will not affect the cost of a schedule. For instance, in C1, job 511 is scheduled to start at   The scheduler can decide the weight of each objective in case there are special requirements or constraints. Huge consumers of electricity always sign contracts with providers, negotiating an agreement of low price for a certain amount of electricity [42]. Suppose the company has to pay extra fees for overuse of electricity: The scheduler adjusts ω 1 to 15 without changing other parameters. The comparison results of candidate schedules after such a modification are provided in Table 5.
The provided IGA obtains the best result (saving 13.14% of total cost) among all candidate schedules. Experimental results also indicate that the studied EFACPS case of 8 jobs in this section has fewer opportunities for energy saving than for failure reduction. The failure model contributes more to the optimization than the energy model for this specific case. Before applying the IGA to actual EFACPS problems, the performance of the algorithm was investigated.

Convergence Analysis
Since random sampling is used in the initialization step (see Section 4.1) of IGA, CGA, and RCA, a Monte Carlo simulation (MCS) is suitable to analyze the result and the performance of algorithms [43]. The studied case is an EFACPS of size 8 (n 1 = 8), having 8! = 40,320 possible candidate schedules. The MCS is set to run 50 times. Each simulation evaluates 207 candidate schedules, with 10,350 schedules in total. For a reasonable comparison, the search space of CGA and RCA should have the same size. Therefore, CGA has same paremeters (n 1 , n 2 , α, p 1 , p 2 ) as IGA. Each simulation of RCA runs 207 iterations for random choice. Figure 7 depicts the IGA search trend of MCS. Figure 7a contains 50 curves; each curve corresponds to one simulation. Despite the consensus that GA produces a near-optimal solution [44], MCS with sufficient trails can provide an optimal solution. Shown in Figure 7b, the algorithm quickly converges to relatively good solutions in the early (<25) generations, with the cost decreasing from 9235.86 B C to fewer than 7250 B C. From 25 to 175 generations, the algorithm steadily converges to the optimal (or near-optimal) solutions. After 175 generations, the algorithm remains stable.  Due to the introduced features of memory, distance evaluation, and random technique, coupled with the conspicuously applied evolution strategy, IGA converges significantly faster than CGA, whose search trend is presented below.
In Figure 8, CGA needs at least 50 generations to obtain a relatively good solution (fewer than 7250 B C), which is 25 generations slower than IGA. Throughout some simulations, CGA does not even converge in 200 generations. Duplicate search and trapping into local optima decline the convergence speed.
The search trend of RCA is also provided in Figure 9. To get a relatively good solution (fewer than 7250 B C), CGA needs at least 75 iterations. Therefore, it has the lowest convergence speed among the three algorithms. Nonconvergence exists in some simulations as well.   An important discussion point is the configuration of parameters when applying IGA. Different settings of parameters lead to distinct search trends, but a good parameter setting can make the algorithm converge faster. Because of random initialization, the same settings can also result in different search trends. In our experiment, this was avoided by giving a fixed random seed.
Regardless of other parameters, a larger generation (iteration) size always provides a better result. However, the scheduler is encouraged to view the search trend and apply other parameter tuning techniques. For instance, if the algorithm remains stable for more than 50 generations, it is considered already converged.

Performance Evaluation
The provided algorithms (IGA, CGA, and RCA) are randomized optimization algorithms with arbitrary input, output, and performance. The graphical representations of 50 simulations in the previous section facilitate the understanding of how the algorithms behave when applied to the studied case of 8 jobs. The convergence speed of IGA (25 generations) is 3 times faster than that of RCA (75 generations) and 2 times faster than that of CGA (50 generations). In this section, statistical analysis was performed for further comparison. In Figure 10, algorithms and statistical indicators are compared and distinguished using different colors and shapes. The averages are marked using triangles. In each iteration, RCA has the highest average cost, followed by CGA, while IGA has the lowest average cost. Therefore, IGA has the best average performance in the three algorithms. Both CGA and RCA have a decreasing average cost with the growth of iteration numbers, indicating the feasibility of these algorithms. The maxima are marked using circles, reflecting the worst performance of algorithms. All algorithms have a decreasing maximum cost when the iteration number increases. In each iteration, RCA has the highest maximum cost, whereas CGA and IGA have a relatively low maximum cost. The minima are marked using squares, representing the best performance of algorithms. All algorithms have the same minimum cost after 50 iterations. Numbers of simulations converged to the maximum cost, the minimum cost, and other cost values at iteration 200 are also provided in Table 6: For IGA, 17 in 50 simulations converge to the minimum. For CGA, this number reduces to 4. For RCA, only 1 simulation converges to the minimum. The standard deviations (SD) of costs are calculated in Figure 11. All algorithms have decreasing SDs as the iteration number increases. In each iteration, RCA has the largest SD, followed by CGA. IGA has a relatively small SD, indicating that the simulation results of IGA do not have a large variation or dispersion. The SD of CGA decreases in early generations (<100) and remains steady afterwards.
The convergence summary and the statistical indicators from Table 6 support our inference that IGA has the best performance among the three provided algorithms: In 50 experimental results of MCS, IGA has the lowest average cost and a relatively small SD. Furthermore, IGA are more easily to converge to the minimum than the other two algorithms.  The coverage ratio [45] for the simulation results of RCA, CGA, and the IGA was also examined. Defined in Equation (17), the coverage ratio C(A, B) represents the number of points in set B dominated by set A over the total number of points.
C(A, B) = |{x ∈ B | ∃y ∈ A : y dominates x}| |B| (17) In our case, y dominates x is defined if point y has lower or equal cost than point x. The summary of the coverage ratio between simulation results is presented in Table 7; a higher ratio of C(A, B) implies a better performance of A over B in the perspective of result coverage. C(IGA, CGA) and C(IGA, RCA) remains 1 in all iterations, indicating that IGA can always find an equal or better result than the other two algorithms. C(CGA, IGA) and C(RCA, IGA) vary in different generations. C(CGA, RCA) and C(RCA, CGA) change with the growth of iteration numbers, but in the end, C(CGA, RCA) converges to 1. Therefore, CGA can find an equal or better result than RCA when the iteration number increases.

Complexity Analysis
The detailed procedure of the provided IGA is explained in Algorithm 3, where complexity analysis is conducted on each stage. We investigated the algorithm in the worst case, where each stage requires the longest time and largest space. Afterwards, the asymptotic upper bound [46] of time and space complexity of the algorithm was given.
The provided IGA was implemented in Python using specially designed data structures to improve the efficiency as possible. In the initialization stage, the time consumption for random generation is O(1), and the corresponding space consumption is O(α * n 1 ). In the elitism selection stage, the time for creating sub_pop is O(1), for fitness value sorting is O (2), and the corresponding extra space requirement is O(2 * n 1 ). In the crossover stage, the loser is replaced with DNAs from the original chromosome and the winner, where the time requirement is O(n 1 ) with no more space requirement. In the mutation stage, random choice and swap of positions demand O(1) time and no extra space. Distance evaluation has the time complexity of O(n 1 ) and space complexity of O(2 * n 1 ). In the memory search stage, the time and space requirement is O(β * α), where β is the size indicator of memory, a self-defined constant depending on the workstation running the algorithm. Finally, retrieving the best cost P and candidate schedule C requires O(n 1 ) time with no extra space.
To sum up, the time complexity of IGA is O(n 1 * n 2 ) because of the outer loop of n 2 iterations. Therefore, the corresponding asymptotic upper bound of time complexity of IGA is O(n 2 ). The space complexity is O(n).

Stress Test
The efficiency of the provided IGA is further investigated in this section on a larger problem. On a normal PC (i7 CPU, 16G RAM) released in 2017, for n 1 = 1122 (number of jobs from 2016-01-19 14:00:00 to 2017-11-15 00:00:00, records of 2 years), n 2 = 200, and β = 5, the algorithm requires 77.48 s to obtain a near-optimal schedule with the corresponding cost; each iteration takes 0.39s. The original schedule has the energy cost of 59,956.10 B C and the failure cost of 1,228,990.41 B C, in total 1,288,946.51 B C.
The performance of IGA, CGA, and RCA on such large problem scale was also analyzed using MCS. The number of simulations was set as 50; therefore, each algorithm was executed for about one hour to search for a near-optimal solution. Certainly for large-scale problems, the limited number of simulations cannot ensure that the IGA finds an optimal solution. Schedulers are encouraged to run more simulations for a potentially better result. Same as in Section 5.3.2, the convergence summary and the statistical indicators are presented in Figure 12 and Table 8.
Experimental results showed that after one hour's execution, IGA can provide a near-optimal candidate schedule with 3.03% total cost saving in average compared to the original schedule, which is the lowest in the three provided algorithms (CGA 2.97%, RCA2.56%). Considering both Figure 12 and Table 8, IGA has a rapid convergence speed (fewer than 25 generations) to near-optimal schedules with the lowest average cost and is also more likely to converge to the schedule with the minimum cost (saving 3.17%).

Conclusions
In this paper, the energy-and failure-aware continuous production scheduling problem at the unit process level was investigated. The research put forward a coupled model of energy and failure cost and provides an improved genetic algorithm to solve it. The IGA was implemented in Python with the time complexity of O(n 2 ) and the space complexity of O(n). The algorithm was efficient to search for a near-optimal schedule with volatile energy prices and maintenance-dependent machine failure rates. Real industrial cases from a large pasta manufacturer were studied using the provided algorithms. Compared to an original schedule from empirical records (8 jobs in 5 days), the IGA provided a near-optimal schedule saving 23.24% of total cost. For another larger case (1122 jobs in 2 years), the IGA also found a near optimal solution, saving 3.17% of total cost. The results of Monto Carlo simulations indicate that IGA converges 2 times faster than CGA and 3 times faster than RCA and can always obtain better solutions within limited time and iterations.
Future research could further improve in several directions. Similar to other continuous production systems, like textile dyeing [8], steel-making [22], and construction [23], pasta manufacturing is both time-sensitive and quality-sensitive, with strict constraints of the food industry. The actual industrial environment is much more complicated than the background of the investigated EFACPS problem in this study, with special requirements of suppliers, customers, operators, machine configurations, and product status. The current study has limitations caused by the adopted assumptions in the modeling stage as a consequence of the multivariate production environment. For instance, the cost of maintenances is neglected in this study, but in reality, they are expensive. Additionally, the machine is not obliged to keep busy in continuous manufacturing, since idle periods are allowed between working periods in the face of small breaks, like the shortage of raw materials or the shift of operators.
The proposed method should be further extended to multiobjective problems and more complicated machine environments. Modeling and algorithm design should also take into consideration the availability of data sources or actively make use of sensors and controllers.
In conclusion, this paper puts forward a new perspective of energy and failure management in continuous production scheduling and provides an improved genetic algorithm as an efficient solution with better performance than conventional genetic algorithms.
Author Contributions: K.S.: Research design, simulator implementation, literature retrieval, manuscript writing; T.D.P.: Data acquisition, experiment design, research guidance; X.G.: Simulator design, research guidance; L.M.: Project guidance, direction guidance, financial support; W.J.: Project analysis, research guidance, direction guidance. All the authors have contributed to the scientific part of this work and to the writing of this article.
Funding: This research was supported by the IMEC/ELITE (Efficiency-optimized production Lines using industrial Internet of Things Enhancements) project. More information is available at the https://www.imec-int. com/en/what-we-offer/research-portfolio/elite web page of ELITE.