A Bi-Objective Approach to Minimize Makespan and Energy Consumption in Flow Shops with Peak Demand Constraint

: With the growing concern of energy shortage and environment pollution, the energy aware operation management problem has emerged as a hot topic in industrial engineering recently. An integrated model consisting of production scheduling, preventive maintenance (PM) planning, and energy controlling is established for the ﬂow shops with the PM constraint and peak demand constraint. The machine’s on / o ﬀ and the speed level selection are considered to save the energy consumption in this problem. To minimize the makespan and the total energy consumption simultaneously, a multi-objective algorithm founded on NSGA-II is designed to solve the model e ﬀ ectively. The key decision variables are coded into the chromosome, while the others are obtained heuristically using the proposed decoding method when evaluating the chromosome. Numerical experiments were conducted to validate the e ﬀ ectiveness and e ﬃ ciency by comparing the proposed algorithm and the traditional rules in manufacturing plant. The impacts of constraints on the Pareto frontier are also shown when analyzing the tradeo ﬀ between two objectives, which can be used to explicitly assess the energy consumption.


Introduction
The effectiveness of operations management is crucial for improving the profit of manufacturing company in the fierce market competition. Thus, the flow shop scheduling problem has already been investigated for many years, which is not only a hot topic of researchers but also an interested issue of practitioners in industry. The related research results can be found in the articles and review papers, such as those of Ribas et al. [1], Yenisey et al. [2], and Komaki et al. [3]. Although different kinds of assumptions and details are considered in the literature, it is still important to gather information and explore the features of manufacturing plant to fill the gap between theory and reality since the world changes so quickly in the 21st century.
According to the International Energy Outlook by the U.S. Energy Information Administration (USEIA, 2016), the total world energy consumption is projected to increase from 549 quadrillion British thermal units (Btu) in 2012 to 815 quadrillion Btu in 2040 [4]. It means that a 48% increase from 2012 to 2040 will further aggravate the situation of energy shortage and environment pollution. Thus, sustainability and green economy are attracting more and more attention from the researchers and practitioners in different fields recently. It is reported that 54% of the world energy consumption is caused by the industrial sector [5]. Furthermore, 90% of energy consumption and 84% of CO 2 emission in the industrial sector are attributed to the manufacturing activities [6]. All human beings have reached

Problem Description
In this paper, we study a flow shop composed of several machines M = {M 1 , M 2 , . . . , M m }. A set of jobs J = {J 1 , J 2 , . . . , J n } needs to be processed from M 1 to M m . Thus, each job J i includes a sequence of operations {O ij }. The basic processing time of O ij on machine M j is denoted by p 0 ij . All jobs are ready to be processed at time zero. The completion time of the last job is denoted by C max , which equals to its finish time on the last machine.
PMs need to be performed periodically for each machine to guarantee a high reliability. Maintenance time of M j equals to pt j . When a PM is performed on a machine, it cannot process the job at the same time. The reason is that operator must turn off the machine and stop the processing to execute PM. We assume that the pre-emption of operations is not allowed, i.e., the non-resumable case is considered here. In one PM period, the machine's effective working time cannot be larger than a threshold PT j . The machine's age is defined here to explain the PM constraint. The machine's age Sustainability 2020, 12, 4110 5 of 22 equals to zero at the beginning of horizon. It does not change if machine does not process jobs. The age of machine M j is increased by p 0 ij after processing the operation O ij . In addition, the machine's age becomes zero immediately after a PM. Therefore, the machine's age cannot exceed PT j at any time for each machine according to the PM constraint.
There are three states for one machine: off/idle/work. It is obvious that the power demand of machine is zero when it is off. When one machine is running and no operation is being processed on this machine, it is idle. The energy consumption per unit time of M j equals e id j when it is idle. It is straightforward that shutting down the machine is a good method to save energy when it is idle. However, when the machine is setup again, it consumes additional energy from the off state to the running state. It is assumed that energy consumption of M j caused by setup equals e st j , which is a constant. Then, the energy consumption of M j during idle time and e st j should be compared when deciding to shut it down. The setup time is very small, which is ignored in this paper. At the start of the production horizon, each machine is off. The machine needs to be turned on to process the first job and turned off after finishing all jobs. The additional on/off operations during the scheduling horizon need to be decided according to the planning of production and PMs.
When a job is being processed in one machine, the speed level of machine needs to be selected for the working state. There is a finite and discrete set of levels Level j = 1, 2, . . . , L j for machine M j . Accordingly, the speed set is v 1 j , v 2 j , . . . , v L j j for different levels and the power demand set is e 1 j , e 2 j , . . . , e L j j . The actual processing time of O ij equals to p 0 ij /v 1 j when M j is in Level 1; meanwhile, the energy consumption of M j equals e 1 j during one unit time. It is obvious that the actual processing time is shorter with a higher speed level and the power demand is larger at the same time. In addition, we assume that e 1 j p 0 ij /v 1 j < e 2 j p 0 ij /v 2 j , which means the total energy consumption of one operation is larger when the speed level is higher. It is practical in industry and also approved in the literature.
The total energy consumption during the production horizon consists of the machines' setup consumption and the energy consumption during working time and idle time. The production planning needs to be optimized for minimizing the total energy cost. Meanwhile, the peak demand constraint must be obeyed, i.e., the maximum power demand of production system must be smaller than a promised threshold D. It is common that the power utility proposes this requirement when it signs the contract with its industry customer. The peak demand is defined as the highest average kW measured in each interval of length δ (usually 15 min) during the production horizon. We assume that D is smaller than m j=1 e L j j . Otherwise, all speed combinations are feasible, which means the peak demand constraint is too loose. We assume that D is larger than m j=1 e 1 j . Otherwise, the peak demand constraint is too tight. The manufacturing plant negotiates with the power utility to increase its maximum allowed power demand.
As mentioned above, three interrelated aspects are integrated into our decision model. First, the production sequence and the speed level for each operation need to be determined. Second, the maintenance planning needs to be determined while the machine's age cannot exceed a given threshold. Third, the machines' off/on need to be decided to minimize the total energy cost with the consideration of peak demand constraint. One integrated solution for an example with three machines and five jobs is shown in Figure 1. Different jobs are presented by different colors. Black box means the machine is off during this time. Box with shadow lines means that a PM is performing during this time. The number in "{}" on the box means the speed level of this machine during this time. The length of each interval is δ. The time window of the fourth interval is [t 3 , t 4 ]. In this case, all jobs are finished before the end time line. can only be started after its finish time in M2. Thus, M3 is shutdown to save energy after finishing . Since the time length is longer than the maintenance time, it is a good opportunity to perform a PM, even though the machine's age does not reach the threshold. . Thus, the average power demand in the fourth interval equals to / , which must be smaller than the given . For the traditional flow shop scheduling problems without the consideration of energy-related objective, non-delay schedule is optimal for minimizing makespan, which is proved to be NP-hard. In our study, the integrated problem is much more complicated than the traditional problem. Considering the jobs' sequence, speed level, PMs, and machine's on/off, the searching space size is ! ∏ 2 2 . Besides, additional buffer times need to be inserted into the schedule since the peak demand constraint must be obeyed. Thus, the integrated problem is a mixed integer programming problem with discrete variables and continuous variables.

Mathematical Programming Model
Considering the makespan and the total energy consumption, a bi-objective mathematical model is established as follows. Considering the peak demand, the time window of the interval ]. Furthermore, is the number of intervals. Let be the kth operation on machine . Let be a very large constant. Let be a very small constant which is larger than zero.
Decision variables: : If job is processed at the kth position on machine with speed , it is 1; else, 0. : If there is a PM immediately before , 1; else, 0. : If there is a setup immediately before , 1; else, 0.

Auxiliary decision variables:
: If is started in the interval and there is a setup before , it is 1; else, 0. : Actual processing time of . : Actual energy consumption per unit time when processing . In machine M 1 , a PM is performed between J 3 and J 4 . Otherwise, the machine's age would be larger than the threshold after J 4 without this PM. In machine M 2 , idle time exists between J 1 and J 2 . The machine is not shut down since the idle time is very short, which results in that idle consumption being smaller than e st 2 . Correspondingly, there is one setup between J 2 and J 3 . In machine M 3 , job J 1 is started later than its finish time in M 2 since the average power demand during [t 2 , t 3 ] must be smaller than D. Furthermore, the time length is very long between J 2 and J 3 since J 3 can only be started after its finish time in M 2 . Thus, M 3 is shutdown to save energy after finishing J 2 . Since the time length is longer than the maintenance time, it is a good opportunity to perform a PM, even though the machine's age does not reach the threshold. Now, the average power demand in the 4th interval is calculated here. Let f 31 be the finish time of Thus, the average power demand in the fourth interval equals to TEC/δ, which must be smaller than the given D.
For the traditional flow shop scheduling problems without the consideration of energy-related objective, non-delay schedule is optimal for minimizing makespan, which is proved to be NP-hard. In our study, the integrated problem is much more complicated than the traditional problem. Considering the jobs' sequence, speed level, PMs, and machine's on/off, the searching space size . Besides, additional buffer times need to be inserted into the schedule since the peak demand constraint must be obeyed. Thus, the integrated problem is a mixed integer programming problem with discrete variables and continuous variables.

Mathematical Programming Model
Considering the makespan and the total energy consumption, a bi-objective mathematical model is established as follows. Considering the peak demand, the time window of the tth interval [WS t , WE t ] equals [(t − 1)δ, (t)δ]. Furthermore, T is the number of intervals.
Let O [k] j be the kth operation on machine M j . Let H be a very large constant. Let H be a very small constant which is larger than zero.
Decision variables: x i[k] jl : If job i is processed at the kth position on machine M j with speed l, it is 1; else, 0. Constraints: j are binaries; others are continuous variables (22) The objectives in Equations (1) and (2) are two kinds of considerations, which show the tradeoff between two conflict aspects. The constraint in Equation (3) ensures that one job must be located in one position of each machine and can only be processed with one speed. The constraint in Equation (4) ensures that one position of each machine can only be engaged by one job. The constraint in Equation (5) ensures that the jobs' sequences on different machines are the same. The constraints in Equations (6) and (7) specify the actual processing time and power demand of the operation O [k] j . The constraint in Equation (8) shows the relation between the start and the finish of O [k] j . The constraint in Equation (9) ensures that one job can only be started after it is finished on the last machine. The constraint in Equation (10) ensures that one machine can only process one task at any time. The constraints in Equations (11) and (12) derive the machines' age at different time. The constraint in Equation (13) is the PM period constraint. The constraint in Equation (14) means that one PM can only be processed when machine is off. The constraint in Equation (15) means that each machine needs to be turn on to start processing jobs at the beginning. The constraints in Equations (16) and (17) mean that the setup can only be located in one interval if there is one setup before one operation. The constraint in Equation (18) means that the energy consumption of machine M j during tth interval is caused by three parts: processing jobs, idle time, and setup. The constraint in Equation (19) specifies the overlap between the processing time of each job on M j and the tth interval, based on which the energy consumption can be obtained using the power demand per unit time. The constraint in Equation (20) shows that, if one setup exists in the idle time between O [k−1] j and O [k] j , then no energy consumption occurs in this idle time; otherwise, the idle-time energy consumption needs to be accounted. The constraint in Equation (21) is the peak demand constraint, which requires the average energy consumption in each interval below the threshold. The constraint in Equation (22) shows the features of decision variables.

Algorithm Designing
The mathematical model cannot be solved effectively by the commercial software, such as Cplex, Grobi, and Lingo. One reason is that the constraints in Equations (12), (19), and (20) are nonlinear. Although it is possible to get the linear formulas using the big constant method and additional 0/1 variables, too many constraints and variables would be added into the model. Besides, the flow shop scheduling problem is still very complicated even if the energy-related factors are ignored. Thus, we adopt the meta-heuristic to solve this model instead of using exact algorithms. According to the research results in [55], meta-heuristics such as genetic algorithm (GA) [56], ant colony optimization algorithm (ACO) [57], and particle swarm optimization algorithm (PSO) [58] are very effective to solve the combinatorial problems such as flow shop, job shop and open shop scheduling problems to get the near optimal solutions.
For the multi-objective optimization problem minimizing a vector of objective functions, the model can be simply formulated as Min Ob j(π) = Ob j 1 (π), Ob j 2 (π), . . . , Ob j τ (π) subjected to constraints. To define the optimal solution for the multi-objective optimization problem, we introduce the concept of Pareto firstly. Let π 1 and π 2 be two solutions for the given problem. If π 1 is worse than π 2 from the viewpoints of all objectives, then π 1 is dominated by π 2 , i.e., π 2 ≺ π 1 . If there is no solution π satisfying π ≺ π 2 , then solution π 2 is Pareto optimal. The set of all Pareto optimal solutions is denoted by the Pareto frontier.
For the single-objective problem, the aim of model-solving is to find the best solution with minimum objective value. For the multi-objective problem, the aim of model-solving is to find the Pareto frontier. For any two solutions π 1 and π 2 on the Pareto frontier, there is no dominance relation between π 1 and π 2 , i.e., π 1 is better than π 2 for some objectives and π 1 is worse than π 2 for the other Sustainability 2020, 12, 4110 9 of 22 objectives at the same time. Some researchers design a weighted sum of all objectives and use the single-objective algorithm to solve the model, which can only get one solution each time for the defined weights. Instead, we adopt the multi-objective optimization algorithm based on non-dominated sorting to search the Pareto frontier directly.

Framework
Deb et al. [59] designed a multi-objective evolutionary algorithm based on the non-dominated sorting strategy and crowding distance, which is denoted by NSGA-II. The performance of an individual in the population is evaluated by the rank and crowding distance. The basic framework of NSGA-II is similar to the traditional GA solving single objective problem. The main difference is how to get new population in the evolution. In NSGA-II, the parent population and the offsprings are merged into one pool. Then, the best individuals are selected from the pool to compose the new population. The detailed structure can be found in [59].
Here, we only provide the basic framework of NSGA-II in order to make the methodology clear and ignore the detailed sorting procedures calculating the rank and crowding distance. As shown in Figure 2, the population evolves from old generation to new generation according to the survival of the fittest. Since the individual with better fitness has a larger probability to generate offspring, the characteristic of this individual will be likely inherited by the offsprings. Thus, the performance of generation g+1 will be better than that of generation g. After evolution, the Pareto optimal solutions are obtained from the last generation g max .
Sustainability 2020, 12, x 9 of 21 Here, we only provide the basic framework of NSGA-II in order to make the methodology clear and ignore the detailed sorting procedures calculating the rank and crowding distance. As shown in Figure 2, the population evolves from old generation to new generation according to the survival of the fittest. Since the individual with better fitness has a larger probability to generate offspring, the characteristic of this individual will be likely inherited by the offsprings. Thus, the performance of generation g+1 will be better than that of generation g. After evolution, the Pareto optimal solutions are obtained from the last generation gmax.
We also provide the particular genetic operators of our algorithm in order that the readers can repeat our study. The chromosome representation. We use two parts to represent a chromosome. Part one is a string of integers, which shows the sequence of processed jobs. For example, {3, 4, 2, 5, 1} implies that job j3 is processed in the first position. Part two is the speed level selected for each operation on each machine, which is a matrix. For example, when there are three machines, {1, 2, 2, 1, 3; 2, 1, 2, 4, 1; 5, 3, 1, 5, 5} implies that the first processed operation in machine M1 is with speed Level 1, the first operation in machine M2 is with speed Level 2, and the first operation in machine M3 is with speed Level 5.
The chromosome evaluation. Based on the chromosome, the main planning is already there. We can get the actual processing time and energy consumption for each operation. Then, we need to optimize and minimizing the objectives without violating the constraints. The detailed procedure can be found in Section 4.2. After finding the optimal and , the makespan and energy We also provide the particular genetic operators of our algorithm in order that the readers can repeat our study.
The chromosome representation. We use two parts to represent a chromosome. Part one is a string of integers, which shows the sequence of processed jobs. For example, {3, 4, 2, 5, 1} implies that job j 3 is processed in the first position. Part two is the speed level selected for each operation on each machine, which is a matrix. For example, when there are three machines, {1, 2, 2, 1, 3; 2, 1, 2, 4, 1; 5, 3, 1, 5, 5} implies that the first processed operation in machine M 1 is with speed Level 1, the first operation in machine M 2 is with speed Level 2, and the first operation in machine M 3 is with speed Level 5.
The chromosome evaluation. Based on the chromosome, the main planning is already there. We can get the actual processing time and energy consumption for each operation. Then, we need to optimize y and z minimizing the objectives without violating the constraints. The detailed procedure can be found in Section 4.2. After finding the optimal y and z, the makespan and energy consumption can be obtained for the chromosome. If C max (ch 1 ) ≤ C max (ch 2 ) and TEC(ch 1 ) ≤ TEC(ch 2 ), then ch 2 is dominated by ch 1 , i.e., ch 1 has a better rank. Otherwise, two chromosomes are located on the same frontier, i.e., they have the same rank. For the chromosomes on the same frontier, the Euclidian distance between chromosomes needs to be calculated. ch 1 is better than ch 2 if they are on the same frontier and the obtained crowding distance based on Euclidian distance of ch 1 is larger.
The selection. Similar to traditional GA, the fitness value based on the performance of chromosome needs to be calculated. In this paper, the fitness value of chromosome is a combination of rank and crowding distance. The chromosome with a better fitness value is a parent with a higher probability compared with the worse ones. To reduce the selection pressure and increase the diversity of population, we use the two-size tournament method. We choose two chromosomes randomly and put the better one into the mating pool. Furthermore, all the other parents are chosen using the same method.
The crossover. We choose two parents ch 1 and ch 2 from the mating pool and perform the crossover procedure with a probability of 0.8. For the first part of chromosome, we use the order crossover method. A string of n binaries is generated stochastically. For example, a string of {1, 0, 1, 0, 0} is generated when n = 5. Firstly, the offspring's gene corresponding to the "1" of the string is copied from the parent ch 1 . Secondly, these integers are deleted from the parent ch 2 . Thirdly, the offspring's gene corresponding to the "0" of the string is copied from the parent ch 2 in sequential. For the second part of chromosome, partial crossover is adopted. For the speed level of each machine, the front of offspring is the same with the front of parent ch 1 and the rear of offspring is the same with the rear of parent ch 2 .
The mutation. We choose one offspring ch and perform the mutation procedure with a probability of 0.05. For the first part of chromosome, we use the inversion mutation method. Firstly, two genes g 1 and g 2 are selected from the string randomly. Secondly, we inverse the numbers in the range of [g 1 , g 2 ] for the chromosome. For the second part of chromosome, one gene is randomly chosen. Then, the operator changes the corresponding speed level randomly.
The population management. The population is composed of 200 individuals. In the initial population, 198 individuals are randomly created and two solutions are constructed heuristically as follows. The first solution corresponds to the low-speed mode, which means the speed levels of all machines are set to the lowest level for processing all jobs. The second solution corresponds to the high-speed mode, which means the speed levels of all machines are set to the highest level for processing all jobs. After selection, crossover, and mutation, we get 200 offsprings. Then, 200 offsprings and 200 individuals from the old population are added into one single pool. To keep good population diversity, we check the pool to find the duplicates and use the mutation method to change them. Finally, we select the 200 best individuals from the pool to compose the new population. The evolution stops when the number of iterations g reaches the maximum limit g max .

Decoding Method
For a chromosome ch, the jobs' sequence and speed level are fixed. We only need to find the optimal y and z to evaluate the performance of ch. Since the PM and machine's on/off can only be executed during idle time between two consecutive jobs, analysis about idle time needs to be done.
Considering the objective of total energy consumption, machine should be turn off when the energy consumption during idle time is larger than setup consumption. Meanwhile, PM should be performed here if the length of idle time is longer than pt j ; otherwise, PM is ignored here unless the maintenance period constraint will be violated without a PM.
When the energy consumption during idle time is smaller than setup consumption, we should keep machine on for saving total consumption. However, if a PM has to be performed here considering the maintenance period constraint, we must turn off the machine since y Using the rules shown in above two paragraphs, we can get the start times of all operations from beginning to the end on each machine if this schedule does not violate the peak demand constraint. Then, the method dealing with peak demand constraint is described here. To guarantee the energy consumption in each interval is below the peak demand threshold, we schedule the operations from the first interval to the last interval one by one. The time window of the tth interval is [WS t , WE t ]. Let J tj be the job started before WS t and still not finished yet at WS t on machine M j . p 0 . If D A t > D, the peak demand constraint has already been violated. Then, we need to modify the speed level of J tj in the chromosome to reduce D A t . When D A t ≤ D, we can always find a feasible schedule for the chromosome by delaying the follow-up operations, which is shown in the following procedure.
Let age j be the age of machine M j . Let o f f j be the state of machine M j .
The equation o f f j = 0 means that M j is shutdown. Otherwise, o f f j = 1 and M j is running.
Step 2. Set D A t = 0. Step 3. Calculate the energy consumption of J tj ∀ j in the tth interval, which is D J tj . Then, we have then change the speed level of J tj and update D A t . Calculate the finish time of J tj for each machine, which is f J tj .
Step 5. If j > m, then set t = t + 1, go to Step 2. Else, if f J tj > WE t , set j = j + 1, go to Step 6.
Else, go to Step 6. Step 6. If D A t ≥ D, then set o f f j = 0, j = j + 1, go to Step 5; else, go to Step 7.
Step 7. Let J tjs be the successor of J tj . Let p 0 J tjs be the basic processing time of J tjs . If p 0 J tjs + age j > PT j , then set o f f j = 0 and perform a PM to set age j = 0.
Step 8. Find the earliest allowed start time of J tjs , which is s J tjs . If s J tjs > WE t , then go to Step 9; else, go to Step 11. Step 9. If o f f j = 0, then set j = j + 1, go to Step 5; else, go to Step 10.
Step 10. Calculate the energy consumption of the remaining idle time of this interval, which is Idle t . If Step 5. Else, set o f f j = 0, j = j + 1, go to Step 5.
Step 11. If o f f j = 1, then try to start the job J tjs as early as possible, update D A t , go to Step 12; else, go to Step 14.
Step 12. If the idle consumption between J tjs and J tj is smaller than e st j , then set o f f j = 0. If the length of this idle time is longer than pt j and o f f j = 0, then set age j = 0. Go to Step 13.
Step 13. If the finish time of J tjs is smaller than WE t , then set age j = age j + p 0 J tjs , consider the next operation in this machine, go to Step 7. Else, this job will be one job started in the tth interval and finished in the following intervals, set j = j + 1, go to Step 5.
Step 14. If e st j + D A t < D, then set o f f j = 1, D A t = D A t + e st j , go to Step 11. Else, set j = j + 1, go to Step 5.

Numerical Results
The algorithm was programmed in the C# platform and the program was run on a Hewlett-Packard laptop with an Intel Core i5 2.50 GHz CPU and 8 GB RAM.

Validation of Algorithm
The problem parameters are set as follows. The length of interval, δ, equals 15 min. The basic processing times of operations are uniformly generated from an interval [10 min, 60 min]. The length of PM period equals 24 h. For each machine, there are six speed levels. Accordingly, the speed set is {1, 1/0.9, 1/0.8, 1/0.7, 1/0.6, 2} for different levels and the power demand set is {1, 1.2, 1.5, 1.8, 2.2, 2.8}. Then, the energy consumption per unit time (1 min) equals 1 unit if the machine processes a job with speed Level 1. The energy consumption per unit time is 0.5 units when the machine is idle. The energy consumption caused by one setup equals 10 units. The production line is composed of five machines. The number of jobs n is set to {50, 100, 200}. The length of maintenance is set to {60 min, 180 min, 300 min}. The peak demand threshold D is set to {8, 9, 10, 11, 12}. Accordingly, the energy consumption limit in one interval equals to {120, 135, 150, 165, 180}. Thus, there are 3 × 3 × 5 = 45 scenarios. For each scenario, three instances are generated randomly, which leads to 135 instances in total.
GA is a kind of algorithm that less depends on the parameters. There are only four parameters in this algorithm: population size, crossover probability, mutation probability, and the maximum generation. For the population size, a larger size usually brings a larger potential to find good solutions. However, a larger size means more computation time. Considering the number of jobs, the size is set to 200. Usually, a large crossover probability is recommended to increase the search ability of GA; a small mutation probability is recommended to guarantee the stability of GA. After calibration on a small set of random instances to evaluate the performance of different probabilities, the crossover probability is set to 0.8 and the mutation probability is set to 0.05. For the maximum generation, it is obvious that better solutions can be found with the evolution of population and more computation time is required for a larger maximum limit. Here, we show the details of one instance in Figure 3. In the 50th generation, 99 solutions are obtained on the frontier. In the 100th generation, 156 solutions are obtained on the frontier. In the 1000th generation, 200 solutions are obtained on the frontier, which means all solutions of population are located on the frontier. The frontier obtained in the 3000th generation is very close to that in 1000th generation, which shows the convergence of our algorithm. Thus, we set the maximum generation 3000 in the following numerical experiments. generation. For the population size, a larger size usually brings a larger potential to find good solutions. However, a larger size means more computation time. Considering the number of jobs, the size is set to 200. Usually, a large crossover probability is recommended to increase the search ability of GA; a small mutation probability is recommended to guarantee the stability of GA. After calibration on a small set of random instances to evaluate the performance of different probabilities, the crossover probability is set to 0.8 and the mutation probability is set to 0.05. For the maximum generation, it is obvious that better solutions can be found with the evolution of population and more computation time is required for a larger maximum limit. Here, we show the details of one instance in Figure 3. In the 50th generation, 99 solutions are obtained on the frontier. In the 100th generation, 156 solutions are obtained on the frontier. In the 1000th generation, 200 solutions are obtained on the frontier, which means all solutions of population are located on the frontier. The frontier obtained in the 3000th generation is very close to that in 1000th generation, which shows the convergence of our algorithm. Thus, we set the maximum generation 3000 in the following numerical experiments. Since it is the first attempt to solve this integrated problem, no algorithm can be found in the literature to solve the model proposed in this paper. Thus, we need to design a method as the benchmark to validate the effectiveness of NSGA-II. It is well known that NEH heuristic is very effective when minimizing the makespan of flow shop scheduling problem without the energyrelated consideration. It is also common that the decisions of production and maintenance are determined independently by two departments in sequential. Combining the idea of NEH and the realistic situation in industrial plants, a constructive heuristic CH is designed as follows.
First, one machine can only process all jobs in a same speed level. Since there are five machines and six levels, we can get 6 5 = 7776 combinations from {1,1,1,1,1} to {6,6,6,6,6}. Second, for each speed combination, we can get the actual processing times of all jobs in all machines. Using the NEH method, Since it is the first attempt to solve this integrated problem, no algorithm can be found in the literature to solve the model proposed in this paper. Thus, we need to design a method as the benchmark to validate the effectiveness of NSGA-II. It is well known that NEH heuristic is very effective when minimizing the makespan of flow shop scheduling problem without the energy-related consideration. It is also common that the decisions of production and maintenance are determined independently by two departments in sequential. Combining the idea of NEH and the realistic situation in industrial plants, a constructive heuristic CH is designed as follows.
First, one machine can only process all jobs in a same speed level. Since there are five machines and six levels, we can get 6 5 = 7776 combinations from {1,1,1,1,1} to {6,6,6,6,6}. Second, for each speed combination, we can get the actual processing times of all jobs in all machines. Using the NEH method, we can get a sequence of jobs for this speed combination. Then, ignoring the peak demand constraint, the PMs are inserted as late as possible into the production planning and the machines' on/off is determined aiming at reducing the energy consumption. It means that we construct 7776 solutions. Finally, we select and keep the solution if it satisfies the following two conditions: (1) the solution does not violate the peak demand constraint at any time; and (2) the solution is not dominated by any other feasible solutions obtained by CH.
For each instance, we compare the solutions obtained by NSGA-II and CH. It is not easy to judge the comparison between two multi-objective algorithms since a set of Pareto solutions are obtained using the algorithm instead of one optimal solution. Coverage metric is the most famous criteria to evaluate the difference between two Pareto frontiers. Let A denote NSGA-II. Let B denote CH. The comparison results are shown in Table 2.   Table 2 shows that most of solutions obtained by CH are dominated by NSGA-II. Furthermore, no solutions obtained by NSGA-II are dominated by CH in all instances. Since Table 2 only shows the performance in percentage, we also record the number of solutions obtained by algorithms in Table 3. In this table, we can find that the number of solutions obtained by NSGA-II is much larger than that of CH. For the NSGA-II, the population size is 200, which means that all individuals in the population after evolution are located on the first frontier. For the CH, the number of solutions is larger for the instances with smaller n and larger peak limit. The reason behind this fact is that more solutions can be kept at the last step of CH when peak demand constraint is looser. However, the performance of CH is very poor considering that 7776 solutions are generated at the beginning of CH. More solutions will provide more choices for the managers.

C(A,B) C(B,A) C(A,B) C(B,A) C(A,B) C(B,A) C(A,B) C(B,A) C(A,B) C(B,A)
The metric can only tell us which frontier is better. However, the extent of improvement and the solution diversity cannot be obtained via the values in the metric. Thus, we draw the Pareto frontiers obtained by different algorithms in Figure 4 for the instance when peak limit is 150 and maintenance time is 300. As shown in Figure 4a-c, NSGA-II performs much better than CH in solution quality and diversity. It also provides us an intuitive illustration for the conclusions obtained from the metric. Meanwhile, we record a "FAST" solution in these figures. A "FAST" solution is that all machines process jobs at the highest speed level to finish all jobs as quickly as possible. Although the makespan of "FAST" solution is about 10% shorter than the fastest solution obtained by NSGA-II, we find that the peak demand of system is larger than the threshold during more than half intervals for the "FAST" solution. It means "FAST" solution is strongly infeasible. diversity. It also provides us an intuitive illustration for the conclusions obtained from the metric. Meanwhile, we record a "FAST" solution in these figures. A "FAST" solution is that all machines process jobs at the highest speed level to finish all jobs as quickly as possible. Although the makespan of "FAST" solution is about 10% shorter than the fastest solution obtained by NSGA-II, we find that the peak demand of system is larger than the threshold during more than half intervals for the "FAST" solution. It means "FAST" solution is strongly infeasible.    The computation time of algorithm mainly depends on the number of jobs. The average computation times of NSGA-II are 2 (n = 50), 8 (n = 100), and 12 min (n = 200). The average computation times of CH are 2 (n = 50), 15 (n = 100), and 45 min (n = 200). Compared with NSGA-II, the computation time of CH increases rapidly with the increment of n. Thus, NSGA-II is better than CH when solving large-sized problems from the viewpoints of solution accuracy and computation time.

Impact of Constraints
The tradeoff between makespan and energy consumption can be found clearly in Figure 4. In the middle part of NSGA-II line, the reduction rate of energy consumption is nearly proportional to the increase rate of makespan. The managers of manufacturing plant need to select appropriate solution based on the production due date. Since PM constraint and peak demand constraint are two important factors in this problem, we show their impacts on the results in this subsection. Figure 5 shows the difference between solutions obtained by NSGA-II under different peak limits. When the peak limit is larger, the length of the line is longer in this figure, which means there are more choices for the managers in this situation. Correspondingly, the managers cannot finish the jobs too early when the peak limit is small even if they can afford a large amount of energy consumption. Besides, the right parts of three lines are almost overlapped. The line under the larger peak limit is slightly better than the line under the smaller peak limit. Then, the impact of peak limit is very small for those managers who want to save the energy consumption with the sacrifice of makespan. On the contrary, the impact of peak limit is large for other managers. The tradeoff between makespan and energy consumption can be found clearly in Figure 4. In the middle part of NSGA-II line, the reduction rate of energy consumption is nearly proportional to the increase rate of makespan. The managers of manufacturing plant need to select appropriate solution based on the production due date. Since PM constraint and peak demand constraint are two important factors in this problem, we show their impacts on the results in this subsection. Figure 5 shows the difference between solutions obtained by NSGA-II under different peak limits. When the peak limit is larger, the length of the line is longer in this figure, which means there are more choices for the managers in this situation. Correspondingly, the managers cannot finish the jobs too early when the peak limit is small even if they can afford a large amount of energy consumption. Besides, the right parts of three lines are almost overlapped. The line under the larger peak limit is slightly better than the line under the smaller peak limit. Then, the impact of peak limit is very small for those managers who want to save the energy consumption with the sacrifice of makespan. On the contrary, the impact of peak limit is large for other managers.   Table 4. The "left solution" is the extreme solution located at the left end of line, which denotes the solution with the smallest makespan. The "right solution" is the extreme solution located at the right end of line, which denotes the solution with smallest energy consumption. Comparing the solutions under different peak limits, the pattern is exactly the same with that in Figure 5. Besides, the makespan under n = 100 is about two times the makespan under n = 50. The same trend holds for the energy consumption.    Table 4. The "left solution" is the extreme solution located at the left end of line, which denotes the solution with the smallest makespan. The "right solution" is the extreme solution located at the right end of line, which denotes the solution with smallest energy consumption. Comparing the solutions under different peak limits, the pattern is exactly the same with that in Figure 5. Besides, the makespan under n = 100 is about two times the makespan under n = 50. The same trend holds for the energy consumption. Figure 6 shows the difference between solutions obtained by NSGA-II under different maintenances. It is obvious that the solutions are better when the maintenance time is shorter. The minimum energy consumptions are the same in different situations, which means there is the same ultimate limit for the reduction of energy consumption. Meanwhile, the difference between the lengths of three lines is very small. The above pattern not only holds for this random instance, but also holds for all instances. It can be proven by the numerical results in Table 4. lengths of three lines is very small. The above pattern not only holds for this random instance, but also holds for all instances. It can be proven by the numerical results in Table 4.

Comparison between Different Solutions
In Table 4, the makespan of the "left solution" is 3101 when pt = 180, n = 100, and Peak = 120. The energy consumption is 20,271 for this solution. The two objectives of the "right solution" are (4030, 17,146). The detailed energy consumption profiles of these solutions are provided in Figure 7. In the figures, we can find that most of consumptions are caused by processing jobs no matter which solution is considered. Comparing Figure 7a,b, we find that the difference of energy consumption between intervals is rather large for the "left solution". However, the energy consumption equals 75 in many intervals for the "right solution". The reason behind the fact is that each machine processes jobs with speed Level 1 most of the time. Then, five machines consume 5 units per unit time, which results in 5 × 15 = 75 units during one interval.
Next, two other solutions are compared. One is the "FAST" mode solution, which is infeasible. The other one is a feasible solution obtained by modifying the "FAST" solution. The detailed energy consumption profiles of these two solutions are provided in Figure 8. In Figure 8a, we find that the makespan can be very small if we ignore the peak demand constraint. However, when we use this plan in the circumstance with peak demand constraint, its real performance is very bad since Figure  8b shows that the feasible solution has longer makespan and larger energy consumption at the same time.

Comparison between Different Solutions
In Table 4, the makespan of the "left solution" is 3101 when pt = 180, n = 100, and Peak = 120. The energy consumption is 20,271 for this solution. The two objectives of the "right solution" are (4030, 17,146). The detailed energy consumption profiles of these solutions are provided in Figure 7. In the figures, we can find that most of consumptions are caused by processing jobs no matter which solution is considered. Comparing Figure 7a,b, we find that the difference of energy consumption between intervals is rather large for the "left solution". However, the energy consumption equals 75 in many intervals for the "right solution". The reason behind the fact is that each machine processes jobs with speed Level 1 most of the time. Then, five machines consume 5 units per unit time, which results in 5 × 15 = 75 units during one interval.
Next, two other solutions are compared. One is the "FAST" mode solution, which is infeasible. The other one is a feasible solution obtained by modifying the "FAST" solution. The detailed energy consumption profiles of these two solutions are provided in Figure 8. In Figure 8a, we find that the makespan can be very small if we ignore the peak demand constraint. However, when we use this plan in the circumstance with peak demand constraint, its real performance is very bad since Figure 8b shows that the feasible solution has longer makespan and larger energy consumption at the same time.
The other one is a feasible solution obtained by modifying the "FAST" solution. The detailed energy consumption profiles of these two solutions are provided in Figure 8. In Figure 8a, we find that the makespan can be very small if we ignore the peak demand constraint. However, when we use this plan in the circumstance with peak demand constraint, its real performance is very bad since Figure  8b shows that the feasible solution has longer makespan and larger energy consumption at the same time.

Discussions
The model established in this paper integrates the production scheduling, PM planning, and energy controlling for the flow shop under peak demand constraint, which also considers the machine's on/off and speed selection. In Table 1, we can find that the model is unique in the literature. Although it is impossible to directly compare our numerical results with the findings of other references from the viewpoint of quantity, the main conclusions can be compared qualitatively to imply the managerial insights for the practitioners in real industry.
The traditional research about flow shop scheduling problem hard performed good work to search the optimal jobs' sequence; for example, NEH method based on the calculation of processing time can provide a solution within a short computation time. However, the assumptions of those references are too ideal and the solution cannot work well in the real plant. The optimal jobs' sequence not only depends on the production requirements but also is strongly affected by the constraints of maintenance planning and energy consideration. According to the findings of our research, the manager needs to balance the makespan and energy consumption firstly. If the jobs are rush orders, then the manager should choose a high speed level for the machines in order to reduce the makespan and bear the consequence of increasing energy cost. Otherwise, the manager would rather choose a low speed level for the machines than finish the jobs quickly with the sacrifice of energy cost. Different speed levels lead to different processing time for a same job, which will strongly affect the optimal jobs' sequence. For the manager who has decided the relative importance between makespan

Discussions
The model established in this paper integrates the production scheduling, PM planning, and energy controlling for the flow shop under peak demand constraint, which also considers the machine's on/off and speed selection. In Table 1, we can find that the model is unique in the literature. Although it is impossible to directly compare our numerical results with the findings of other references from the viewpoint of quantity, the main conclusions can be compared qualitatively to imply the managerial insights for the practitioners in real industry.
The traditional research about flow shop scheduling problem hard performed good work to search the optimal jobs' sequence; for example, NEH method based on the calculation of processing time can provide a solution within a short computation time. However, the assumptions of those references are too ideal and the solution cannot work well in the real plant. The optimal jobs' sequence not only depends on the production requirements but also is strongly affected by the constraints of maintenance planning and energy consideration. According to the findings of our research, the manager needs to balance the makespan and energy consumption firstly. If the jobs are rush orders, then the manager should choose a high speed level for the machines in order to reduce the makespan and bear the consequence of increasing energy cost. Otherwise, the manager would rather choose a low speed level for the machines than finish the jobs quickly with the sacrifice of energy cost. Different speed levels lead to different processing time for a same job, which will strongly affect the optimal jobs' sequence. For the manager who has decided the relative importance between makespan and energy cost, he/she still needs to solve the integrated model proposed in this paper to find the optimal jobs' sequence since the machines will be shut down when PM is performed.
The execution of PMs should be planned to avoid disturbing the processing of jobs. According to the results in [25], manager should make full use of the idle time between consecutive jobs to perform PMs, which will lead to more PMs than the minimum required number of PMs in order to minimize the makespan. However, the machine needs to be shut down when performing PM, which means that the additional energy needs to be consumed when turning on the machine again. Thus, the manager needs to consider the energy consumption of machines' setup. If it is small, then the suggestion in [25] still works well. Otherwise, the manager should pay attention to the number of PMs in order to reduce the waste of energy. Thus, the impact of PM on the production is not only caused by the unavailability of machine but also via the channel of energy consumption.
The authors of [33][34][35][36][37][38][39] investigated when to turn off the machine during the production horizon in order to reduce the energy consumption of idle machine. Let γ be the power demand per unit time of idle machine. There is a tie between the consumption of idle machine and the setup consumption, i.e., if the length of idle time is longer than τ = setup/γ, the machine should be shut down. Therefore, the arrangement with many small intervals with shorter time is not good; the arrangement with a few large intervals with length τ is encouraged. The above suggestion is very useful when manager wants to save the energy cost. In addition, we find that the interval of idle time is a good opportunity for performing PMs. If the length of one maintenance is shorter than τ, a PM can be performed here to improve the reliability of machine since it has already been shut down. If the length of one maintenance is longer than τ, then we suggest the manager arrange the production to generate idle intervals whose lengths equal the maintenance time rather than τ.
According to the research results of selecting machines' speed levels to adjust the tradeoff between makespan and energy consumption in literature, it is suggested to set the highest level for each machine in order to finish the jobs as quickly as possible. However, it will lead to a high power demand of the production system. If the peak demand limit is tight, the solution's real performance becomes very bad, as shown in Figure 8. It shows the manager that the makespan cannot be constantly decreased due to the peak demand constraint. Furthermore, the greater the peak limit is, the more the makespan can be reduced. This finding suggests manager to consider the peak limit when evaluating whether a production line can complete an order on time or not. According to the results in this paper, the speed levels of different machines should be matched to guarantee that the maximum power demand is below the limit. Some machines, in which the job's basic processing time is longer, should work with high speed level. The other machines, in which the job's basic processing time is shorter, should work with low speed level. Then, the makespan can be reduced as much as possible under the condition with a good performance in the energy aspect.

Conclusions
This paper integrates three aspects, namely production, preventive maintenance, and energy consideration for the flow shops, with the PM constraint and peak demand constraint. A mathematical model is established, and a meta-heuristic based on NSGA-II and decoding method is designed to solve the model effectively. Using this approach, the Pareto frontier can be obtained to balance the tradeoff between economic objective and environment objective. The numerical results show that impacts of PM constraint and peak demand constraint need to be analyzed case by case.
This research is closely related to the practical application in industrial plant. The main practical implications can be formulated from four points. First, the operations management problem of manufacturing plant is a complicated topic coupled with different aspects. The effectiveness of research will be discounted if researchers only focus on the theoretical study in one aspect. The only way to fill the gap between theory and reality is investigating the integration problem. Second, plant manager can decrease the makespan if more energy consumption can be accepted by the manager. However, the makespan cannot be constantly decreased due to the peak demand constraint. Third, the length of maintenance will strongly affect the makespan and it does not affect the total energy consumption. The positions of PMs should be determined while considering the machines' on/off decision. Fourth, although the total energy consumption is caused by three parts, processing jobs occupies most of energy consumption. Thus, plant manager should pay attention to the processing jobs instead of idle time and setup consumption.
There are several limitations which can be found in the problem assumption. First, we assume that the periodical PM gets rid of the unexpected machine failures and the plant environment is deterministic. Therefore, we establish a deterministic mathematical model in this paper. In the future, the impact of uncertain factors can be considered, e.g., the unexpected machine failures, the rush orders, and the temporary blackout of electricity. Second, we use the total energy consumption as the energy objective. Therefore, it is unnecessary to insert idle times into the schedule to adjust its consumption pattern. In the future, the time-of-use tariff, which provokes the energy users to shift the power demand from peak-hours to valley-hours, can be considered. Third, we assume that there is only one machine in each stage of flow shop. Therefore, the machine allocation problem is avoided in this paper. In the future, parallel machines in each stage can be considered to compose the hybrid flow shop. In addition, only one energy resource is considered in this paper. Renewable energy resources such as solar energy and wind power are encouraged to be alternatives of fossil fuels since renewable resources are cleaner and greener. In the future, it is also very interesting to investigate how to supply the power demand of plant using the micro-grid system.