Digital Twin Based Optimization of a Manufacturing Execution System to Handle High Degrees of Customer Specifications

Abstract: Lean production principles have greatly contributed to the efficient and customer-oriented mass production of goods and services. A core element of lean production is the focus on cycle times and designing production controls and buffers around any bottlenecks in the system. Hence, a production line organized by lean principles will operate in a static or at least quasi-static way. While the individualization of products is an interesting business approach, it can influence cycle times and in-time production. This work demonstrates how performance losses induced by highly variable cycle times can be recovered using a digital twin. The unit under analysis is an industrial joiner's workshop. Due to the high variance in cycle time, the joinery fails its production target, even if all machines are below 80% usage. Using a discrete-event simulation of the production line, different production strategies can be evaluated efficiently and systematically. It is successfully shown that the performance losses due to the highly variable cycle times can be compensated using a digital twin in combination with optimization strategies. This is achieved by operating the system in a non-static mode, exploiting the flexibilities within the system.


Scope
The motivation for this paper is the continuing production planning needs that are associated with the move from batch manufacturing to single-piece flow. This move requires major changes to existing production planning processes [1]; many challenges have been experienced with the move to reduced batch sizes in support of lean production management. The efficiency gains from single-piece flow (SPF) in a factory are often negated by the impact of the additional production planning and controlling processes [2], and many firms move to a hybrid approach based on smaller batch sizes, managing the variability through the use of Kanbans. Kanbans can be either static or adaptive in such production control systems [3]. However, this does not address the challenge that new adaptive production planning tools present, as they provide the opportunity for mass customization. Mass customization allows the firm to deliver the technical requirements of a customer, but brings with it complications to the production planning and control processes [4]. Cost data and cycle times are important within the production process, but also for the sales team during the quoting phase.
Given a set of jobs requiring significantly different cycle times on a set of limited stations, the time to completion (TTC) will depend on the number of jobs and on the scheduling: a slow job can delay a fast job if the latter cannot overtake it. These kinds of scheduling problems are NP-complete [13]. The effort to find the time-optimal strategy increases nonlinearly with the problem size (i.e., the number of jobs), and the problem cannot be solved efficiently. In computer science, this kind of problem is common in scheduling applications for real-time systems: find a schedule allowing the completion of a set of tasks using limited resources in time.
Examples in computational science show that the problem's complexity is significantly reduced if one looks only for a feasible solution by applying strategies such as shortest job first (SJF) [14]. Given a prediction of the required resources and cycle times, as well as a model of the system, the literature shows that finding a feasible schedule is a procedure that can be mastered.
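The SJF strategy can be sketched in a few lines. The `Job` type and its fields below are hypothetical stand-ins for the cycle-time predictions discussed later, not the paper's implementation:

```python
# Shortest job first (SJF): order jobs by their predicted total cycle time.
# The Job type and its fields are illustrative assumptions for this sketch.
from dataclasses import dataclass

@dataclass
class Job:
    name: str
    predicted_cycle_time: float  # predicted total processing time, e.g., in hours

def sjf_schedule(jobs):
    """Return the jobs ordered shortest-first, a cheap feasibility heuristic."""
    return sorted(jobs, key=lambda j: j.predicted_cycle_time)

batch = [Job("A", 5.0), Job("B", 1.5), Job("C", 3.0)]
print([j.name for j in sjf_schedule(batch)])  # ['B', 'C', 'A']
```

SJF does not guarantee the time-optimal order, but it avoids the combinatorial search while keeping short jobs from being blocked behind long ones.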
Modelling and simulation of production systems enable the system's performance to be predicted for a given scenario. Material flows of linear, as well as branched, production systems can be modelled using discrete-event (DE) simulations [15,16]. Implementations of DE simulation for production systems require at least a list of events, a list of stations and a list of moveable objects (MOs). The latter can represent material and/or information, while the list of stations represents the available machines and cells within the modelled system, as well as their current state [17]. There are several commercial tools for DE simulation in the context of production systems, such as Plant Simulation [18], DDDSimulator and AnyLogic [19].
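As a concrete illustration of these three ingredients, the following is a minimal, generic DE simulation core for a serial line with FIFO queues: an event list (priority queue), stations, and MOs. It is a sketch under simplifying assumptions, not the implementation of any of the tools named above:

```python
# Minimal discrete-event (DE) simulation core: an event list (priority queue),
# a list of stations, and moveable objects (MOs) flowing through them in order.
# Generic sketch; not the implementation of any commercial DE tool.
import heapq
from collections import deque

class Station:
    def __init__(self, name, cycle_time):
        self.name = name
        self.cycle_time = cycle_time
        self.queue = deque()   # FIFO queue of waiting MOs
        self.busy = False

def simulate(stations, mos):
    """Run each MO through the stations in sequence; return finish times."""
    events = []   # the DE event list: (time, tie-breaker, kind, (mo, station))
    seq = 0
    finish = {}
    for mo in mos:
        heapq.heappush(events, (0.0, seq, "arrive", (mo, 0))); seq += 1
    while events:
        t, _, kind, (mo, idx) = heapq.heappop(events)
        st = stations[idx]
        if kind == "arrive":
            st.queue.append(mo)
            if not st.busy:           # station idle: start the waiting MO
                st.busy = True
                nxt = st.queue.popleft()
                heapq.heappush(events, (t + st.cycle_time, seq, "done", (nxt, idx))); seq += 1
        else:                         # "done": mo leaves station idx at time t
            if st.queue:              # start the next MO in the FIFO queue
                nxt = st.queue.popleft()
                heapq.heappush(events, (t + st.cycle_time, seq, "done", (nxt, idx))); seq += 1
            else:
                st.busy = False
            if idx + 1 < len(stations):
                heapq.heappush(events, (t, seq, "arrive", (mo, idx + 1))); seq += 1
            else:
                finish[mo] = t
    return finish

line = [Station("saw", 2.0), Station("mill", 1.0)]
print(simulate(line, ["a", "b"]))  # {'a': 3.0, 'b': 5.0}
```

The event list orders all state changes by time, the stations carry the current system state, and the MOs are the entities whose flow is predicted.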
By synchronizing simulation and data models with the physical system, a digital twin of the real system is obtained. According to Stark and Damerau, "a digital twin is a digital representation of an active unique product (real device, object, machine, service, or intangible asset) or unique product-service system (a system consisting of a product and a related service) that comprises its selected characteristics, properties, conditions, and behaviors by means of models, information, and data within a single or even across multiple life cycle phases" [8]. A well-founded review of the types and applications of digital twins is presented by Jones et al. [9]. According to the authors, a fundamental property of a digital twin is the standardized and/or automatized data exchange between the physical and the digital world. A detailed characterization of a digital twin for production systems is provided by ISO/DIS 23247 [20][21][22][23]. This virtual representation of the real system enables time- and cost-efficiency, as well as the safe testing and optimization of new operational modes. Applications of digital twins in production systems are presented by Park et al. [24] and Ding et al. [25].

Materials and Methods
The aim of the approach presented here is the systematic offline identification of a feasible production schedule, enabling the intended throughput of highly customized parts to be completed on time. The approach follows the concept presented in [12] and combines lean production with a digital twin of the production line, enabling a feasible schedule to be produced.
Before going into technical details, the following simplified example will demonstrate the outline of this approach. The example is a linear production line with five process steps and a batch of 15 production jobs. The order of production jobs along the line is fixed; jobs cannot overtake each other. On average, the line is balanced; hence, the expected cycle time for each job is approximately equal in every process step. This average cycle time is selected such that the batch's TTC is approximately 6 h. The goal is to finish within one shift of 8 h. Then, normally distributed noise is added to the jobs' cycle times. As long as the variance of this noise is small enough, the TTC of (almost) all scenarios remains below 8 h (see the blue bars in Figure 1). If the variance becomes too large, scenarios violating the TTC limit of 8 h must be expected. In Figure 1, there is an example where only 15% of the scenarios finish in time (see the red bars). Hence, two cases must be discussed: one with low variance in cycle time and one with high variance. In both cases, the expected cycle time of each job at each process step is still the same. On average, both lines are still balanced, but they perform completely differently. Assuming a model of the line is available and able to predict the job-specific cycle times at each process step, one can come up with a clever schedule for the jobs within one batch to avoid jams as much as possible. Repeating the experiment with high-variance noise but applying SJF scheduling prior to the batch's execution, a TTC of less than 8 h can be obtained in 95% of the scenarios (see the yellow bars in Figure 1). By integrating such a model into the workflow of the planning, execution and supervision of production jobs, a digital twin in the sense of Figure 9 on page 45 of [9] is obtained.
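This experiment can be reproduced with the standard recurrence for a no-overtaking flow line; the mean cycle time and noise level below are illustrative assumptions, not the exact values behind Figure 1:

```python
# TTC of a no-overtaking flow line (five steps, 15 jobs) under noisy cycle
# times, comparing the given order against SJF. Mean cycle time and noise
# level are illustrative assumptions, not the values used for Figure 1.
import random

STEPS, JOBS, MEAN = 5, 15, 1.0

def ttc(times):
    """times[j][s]: cycle time of job j at step s, jobs processed in list order.
    C[j][s] = max(previous job at step s, same job at step s-1) + times[j][s]."""
    C = [[0.0] * STEPS for _ in times]
    for j, row in enumerate(times):
        for s, p in enumerate(row):
            C[j][s] = max(C[j - 1][s] if j else 0.0,
                          C[j][s - 1] if s else 0.0) + p
    return C[-1][-1]

random.seed(1)
batch = [[max(0.1, random.gauss(MEAN, 0.5)) for _ in range(STEPS)]
         for _ in range(JOBS)]
print(ttc(batch))                    # TTC of the batch in its given order
print(ttc(sorted(batch, key=sum)))   # TTC after SJF (shortest total time first)
```

For the noise-free balanced line (all cycle times equal to MEAN), the recurrence gives TTC = (JOBS + STEPS - 1) * MEAN, i.e., 19 h here; the noise experiments perturb this baseline.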
This digital twin operates in parallel to the physical production system: it reads data from the real system and provides data back to the real system in the form of a feasible schedule. In this approach, this simple example is transferred to a real non-linear production system following the same schematics as in the example. A digital twin is implemented to predict the TTC of a batch and apply an optimization strategy to the production schedule to obtain a feasible production sequence. Hence, the methodology includes three important steps:

1. Analysis of the system and identification of the relevant machines, buffers, and transportation systems, as well as the abstraction of the jobs to be optimized;
2. Development and implementation of a digital twin of the branched production system able to predict the relevant cycle and lead times for a given production batch;
3. Application of sophisticated optimization routines to the production schedule in order to identify a feasible schedule for the given batch.
For step 1, a walk-through similar to a Gemba walk [26] is performed. During this walk-through, we familiarize ourselves with the available production processes and systems, the material flow and the characteristic dynamics of the overall system. Secondly, the relevant machines are defined and characterized together with the process owner, and the relevant production layout is extracted from the existing 2D drawings of the shop floor. In step 2, the production system is deconstructed into a graph. The graph's nodes represent the single machines and buffer places, whereas the edges represent the transportation facilities within the system. Production jobs are modelled as moveable entities on this graph. Whether or not a job is allowed to move depends on a very basic ruleset, similar to the one used in colored time-dependent Petri nets [27,28]. Each job is parametrized by its required production sequence (types of machines and times required) and its due date. For example:
• Operation times are determined and specified in the production plan for each job;
• If the next operation time is 0, the job is sent to the subsequent workstation with a positive operation time;
• Machine queues are FIFO and transport times between workstations are constant, i.e., jobs cannot overtake each other within a queue;
• A set of jobs is merged into one batch, constrained by a starting time equal for all jobs and a finishing time based on the due date of each job. No job is allowed to enter the system before the start time, while all jobs should be finished before the finishing time.
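The job parametrization and the zero-operation-time rule can be sketched as follows; the type, field names and the example route are hypothetical, not taken from the paper's Java implementation:

```python
# Sketch of the job parametrization and the zero-time skip rule described
# above. Type, field names and the example route are illustrative assumptions.
from dataclasses import dataclass

@dataclass
class ProductionJob:
    name: str
    route: list        # (station, operation_time) pairs in the required order
    due_date: float    # deadline relative to the batch start time

def effective_route(job):
    """Apply the zero-time rule: operations with time 0 are skipped, so the
    job goes straight to the next workstation with a positive operation time."""
    return [(station, t) for station, t in job.route if t > 0]

job = ProductionJob("window_frame_07",
                    [("saw", 0.5), ("edge_band", 0.0), ("mill", 1.2), ("assembly", 2.0)],
                    due_date=8.0)
print(effective_route(job))  # [('saw', 0.5), ('mill', 1.2), ('assembly', 2.0)]
```

In this form, a batch is simply a list of such jobs sharing a common start time, each carrying its own due date.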
Using this outline, a DE simulation is implemented. A batch of production jobs and a schedule coming from the PPS in place are input to the simulation, along with the starting date. The output of the simulation is the TTC and finishing date for each job in the batch. For the analysis and visualization of the results, Gantt charts of the job sequence are used, as well as a time-discrete 3D visualization of the shop floor. The Gantt charts are used to provide input to the planner and to the machine operator. In the 3D visualizations, the floorplan of the factory is shown to scale and overlaid with the jobs moving through the shop floor. This visualization supports the verification of the results by the planner and enhances trust in the digital twin by providing a familiar visual input.
Using a modified sorting algorithm, the solution space is traversed, looking for a feasible schedule by modifying the production order, and thereby the starting times, within the batch. An exhaustive sweep over the solution space is not possible due to computational limitations, i.e., the expected time required exceeds the time available between two batches. Figure 2 shows the outline of the described digital twin with the procedures discussed. The core of the digital twin includes a graph-based system model used in a DE simulation acting as a virtual copy of the real system. If provided with an (existing) job sequence from the PPS in place, the digital twin will predict the expected material flows and machine utilization and visualize them as a Gantt chart and 3D animation. An additional optimization loop enables the systematic virtual testing of different production sequences with respect to TTC in order to find a feasible solution, i.e., a job sequence that is able to finish in time. These visualized pieces of information can then be used in the physical system's PPS (e.g., optimal job routine) or in the shop floor meeting (e.g., training the operators for the upcoming week's procedures). Since the digital twin implements its own model of the system, it is more than a pure shadow of the data available in the PPS.
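The feasibility-driven search can be sketched as a loop that swaps neighbouring jobs, re-evaluates the predicted TTC, and stops as soon as the batch finishes before the deadline. The swap neighbourhood and the two-machine toy TTC model below are illustrative choices, not necessarily the paper's modified sorting algorithm:

```python
# Feasibility-driven schedule search: swap neighbouring jobs, re-evaluate the
# predicted TTC, and stop as soon as the batch finishes before the deadline.
# The neighbourhood and the toy TTC model are illustrative assumptions.
def find_feasible_sequence(jobs, ttc_of, deadline, max_iters=1000):
    order, best = list(jobs), ttc_of(list(jobs))
    for _ in range(max_iters):
        if best <= deadline:
            return order, best          # feasible: no need for the optimum
        improved = False
        for i in range(len(order) - 1):
            order[i], order[i + 1] = order[i + 1], order[i]
            cand = ttc_of(order)
            if cand < best:
                best, improved = cand, True
            else:
                order[i], order[i + 1] = order[i + 1], order[i]  # undo swap
        if not improved:
            break                        # local optimum, still infeasible
    return (order, best) if best <= deadline else (None, best)

def two_machine_ttc(order):
    """Toy TTC model: makespan of a two-machine flow line without overtaking."""
    a = b = 0.0
    for t1, t2 in order:
        a += t1               # completion time on machine 1
        b = max(a, b) + t2    # completion time on machine 2
    return b

jobs = [(3.0, 1.0), (2.0, 3.0), (1.0, 1.0)]
seq, t = find_feasible_sequence(jobs, two_machine_ttc, deadline=7.5)
print(seq, t)  # a feasible order with TTC 7.0 (initial order needs 9.0)
```

Stopping at the first feasible sequence, rather than searching for the optimum, is what keeps the computational effort within the time available between two batches.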

Results
In the following section, the application of the approach is demonstrated on an existing production facility. The company in this application is a Swiss wood joinery firm, which specializes in manufacturing custom products, mainly furniture, doors and windows. The firm handles about 400 projects per year. A major focus of the company is product customization, with a customer-oriented design process and a flexible production plan, supporting single-lot product realization. The company mainly serves business-to-business (B2B) customers, with the majority of products being single pieces and/or small-lot production commissioned by architects and designers. The manufacturing process is therefore strongly dependent on product requirements and subject to changes, according to the customers' specifications. In general, these products consist of a wooden structure made of joined parts (e.g., a window frame), including metal components and joints (e.g., closing devices and hinges). Two use-cases are presented: the first focuses on job sequence optimization, while the second use-case shows the application of the same procedures to the planning of maintenance.

System Analysis and Digital Twin Implementation
The typical workflow (not considering raw material procurement) begins with the sawing of the wooden panels. In order to optimize pattern cutting, the company uses software for the placement of components on each panel. Based on the list of pieces to be produced and the panels available, the software calculates the best allocation to minimize material consumption, sectioning times and production costs. Once the sawing process is completed, the panels are labelled and, based on the specifications, some panels are edge banded. Afterwards, products are drilled and milled on two different machines. At this point, the operator inserts the metal components and joints, or directly sends the parts for final assembly. Once the assembly process has been completed, products are packed and shipped.
Currently, decisions related to job sequence and part routing inside the plant are taken based on a rule of thumb, instead of process-specific data. The current open issues related to the production process revealed during the walk-through are:

• There are no specific rules that regulate the products entering the drilling/milling process;
• The production manager does not have information about the trend of the production process;
• The impossibility of defining reliable production planning throughout the week/month;
• The impossibility of defining a due date with certainty;
• The impossibility of providing clear estimations of production costs and associating them with specific products, both in the design phase and at budget closure.
For these reasons, the production is subject to several delays and uncertainties that increase its complexity. The situation led to the decision to use a digital twin to understand and optimize the production plans. Figure 3 shows the production topology identified during the walk-through and used by the digital twin. The system is modelled as a total of 21 process steps in a network with five branches. These process steps are located within the layout shown in Figure 4. The model and simulation part of the digital twin is implemented using the DDDSimulator environment from Technology Transfer System S.r.l. This simulation environment allows users to build logic module prototypes representing elements of the system to be simulated (machines, workplaces) by programming in Java. Such modules can then be used as instances in a visual drag-and-drop environment and connected to compose the logic of the simulation model. Moreover, DDDSimulator provides a virtual 3D environment, which can show material flows and machine movements. This is especially helpful for the process owners in understanding the simulation results. During the test application, the digital twin is deployed as a lightweight compiled Java project (Java Archive, JAR).

Use-Case: Feasible Schedule
In this use-case, the digital twin is used on an existing batch. To do so, the job characteristics are exported from the existing PPS. In the default production sequence (see Table 1), the batch cannot be finished in time, and five of nine jobs (55%) cannot adhere to their deadline, even if sufficient machine capacity and time is available. This leads to a total of 22 days of cumulative delay. In a second step, the production sequence was optimized using the digital twin. As presented in Table 1, only two of nine jobs (22%) fail to meet the deadline after optimization, with a cumulative delay of 6 days. Compared to the initial scenario, this translates to an improvement of 60% in deadline compliance (measured as the number of jobs with missed deadlines) and a 72% improvement in cumulative delay. The effects of the optimization on the production sequence are shown in Figures 5 and 6, respectively. After optimization, not all jobs are compliant: it can be observed that Job 1 will always be late, even if it is loaded on day 1, due to its duration.
Figure 5. Production schedule prior to optimization for the use-case.

Use-Case: Planning of Machine Maintenance
Planned machine maintenance and service are important elements to ensure continuous quality standards and availability. In many cases, maintenance will affect the production, since machines are not available while being serviced. In lean production, this challenge is addressed by the concept of total productive maintenance (TPM, [10]). By doing so, the planning of production and the planning of maintenance are not treated as separate topics, but optimized together with respect to the overall benefit for the company. Using the presented optimization strategy, the concept of TPM can be applied to the joinery use-case. To this end, a dummy job is introduced for the machine to be serviced. This dummy job includes the required time for the planned maintenance, as well as the earliest date possible and the due date. By doing so, the machine maintenance is planned with respect to the feasibility of the manufacturing schedule, i.e., the completion of the jobs in time should not be affected by the planned maintenance. Figure 7 shows the result of this approach. The scheduler deliberately brings the machine maintenance forward. Hence, a trade-off between fully exploiting the service interval and not affecting the in-time completion of the jobs is successfully achieved.
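The dummy-job idea can be sketched as follows; the type, field names and the interval data are hypothetical, and the simple first-fit placement stands in for the full scheduling run described above:

```python
# Sketch of TPM integration: planned maintenance is modelled as a dummy job
# occupying the serviced machine, with a duration, an earliest possible date
# and a due date. Type, field names and interval data are illustrative.
from dataclasses import dataclass

@dataclass
class DummyMaintenanceJob:
    machine: str
    duration_h: float
    earliest_start: float   # earliest admissible date (hours from batch start)
    due_date: float         # latest admissible completion

def maintenance_window(job, busy_intervals):
    """Place the maintenance in the first idle gap of the machine that
    respects the earliest-start and due-date constraints of the dummy job."""
    t = job.earliest_start
    for start, end in sorted(busy_intervals):
        if start - t >= job.duration_h:
            break                       # the gap before this interval fits
        t = max(t, end)                 # otherwise wait until the machine idles
    if t + job.duration_h <= job.due_date:
        return (t, t + job.duration_h)
    return None  # infeasible: maintenance would delay the production jobs

svc = DummyMaintenanceJob("mill", duration_h=2.0, earliest_start=4.0, due_date=16.0)
print(maintenance_window(svc, [(0.0, 5.0), (6.0, 9.0), (12.0, 20.0)]))  # (9.0, 11.0)
```

Because the dummy job carries both an earliest-start and a due date, the scheduler can trade off exploiting the service interval against not delaying production, as in Figure 7.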


Discussion
This discussion is divided into three parts: the implications of the case, the academic implications, and the managerial implications.

Case Implications
The firm in the study faces the same challenge as many other engineered-to-order businesses, in that customer specifications cause every job to be different in one way or another. The customer order dimensions examined in this case include the physical size and shape, the delivery dates, the loading order of the delivery, and the machining quality. Within this, there are production factors that are interrelated, as the firm has a mixture of other projects or jobs that need to be routed through the factory. Within the production facility there are further constraints, some physical, some machine-based, and others based on the capabilities of the production workforce. Within the firm itself there is a complex set of negotiations between sales, production and workshop maintenance tasks. The production manager has to take a very balanced view of production capacity, and without the digital twin, the production manager had been using a mixture of spreadsheets and experience. By visiting the workshop and meeting with the production workers, the production manager has been able to balance the conflicting requirements and limitations. However, this often results in a detrimental impact on delivery schedules and a potentially far from optimal position in terms of the firm's commercial objectives.
Customer specifications are challenging, with variable cycle times and job sequencing that result in the loss of the advantages of a balanced production line designed according to basic lean principles. The fundamental problem is that balanced production lines have very stiff characteristics and do not dynamically adjust to situations, meaning that deviations due to different specifications can impact all the jobs in the line, often with unintended consequences that, with traditional tools, are difficult to quantify and adjust for in an agile manner. The result is often experienced as a jam in the system, where upstream jobs are hindered by downstream jobs; this results in a significant capacity loss that shows up as reduced productivity and has often been measured as reduced OEE.
Historically, there have been two ways to deal with these production losses. The first is to streamline the product towards a balanced production line (e.g., design for manufacturing, DfM), and thereby avoid the cycle time variations. This is the approach of mass production and standardization; however, it does not suit the move to lot-size-one or mass customization. The second way is to learn to exploit the flexibility within the production line, and this work examines the second way. Through optimizing the production order, it has been shown that it is possible to reduce the hindering of upstream jobs. This is done using the advisory support of the digital twin [29], which provides a small number of feasible solutions based on the system constraints and the associated variability of each step. The job of the digital twin was not to provide the optimal solution, but rather to support the production manager with planning by providing feasible solutions that allow the project to be completed on time. It may be possible to find the "best solution", but this in itself may only be transient, as sales send new jobs to production. By focusing on a feasible solution, the computational effort has been shown to be reduced significantly relative to the problem size compared to finding the optimal solution. The digital twin is also able to integrate variability into a situation (e.g., the length of time on a specific machine), allowing the production manager to better understand the buffers that exist within the system.
The value for the firm was increased when the production planning system was shared with the sales function, as it allowed them to inform their customers when delivery could be expected. With the integration of the knowledge of the time per task, the costing of each job could be improved; with engineered-to-order production, this can be challenging due to the dynamic nature of job costing. Alternative routings could also be identified by the digital twin, providing increased flexibility and agility for the production team. The integration of the visuals [30] with the sales team should make for improved discussions due to the transparency it provides. In the past, the integration of maintenance could be problematic; however, with the use of the maintenance rules, it is now possible to include this in the visual planning, both short-term (e.g., daily/weekly) and longer-term (e.g., monthly/yearly), so that the need for maintenance is not a surprise. Potentially, opportunities for maintenance, as well as planned maintenance, could be integrated into the digital twin visuals. The implications of unplanned maintenance or breakdowns can be better integrated into the production plan and adaptations made quickly, with the consequences clearly visible to all. Visual management and team decision making are core concepts of lean.
The digital twin should, in the longer term, capture knowledge from the shop floor and provide additional insights into what is possible [31]. To achieve this, it could be imagined that data could be captured from the ERP and MES systems and, using ML/AI, integrated into the agent-based modelling approach that is the core of the digital twin. Doing so would support further learning and the ability to compare plans to actual results. This would support the firm when new customer projects are being bid on where there is limited experience, when hiring or training new staff, or when making investment decisions for new machines.

Academic Implications
The integration of the ERP and MES in terms of supporting production control was identified as important in 2005 by Kelle and Akbulut [32] and, today, this is the basis for many of the software-based decision support systems in supply chains [33,34]. These systems can support production managers in making better-informed decisions around planned and unplanned events and support basic production planning questions. The dynamic re-planning [35] of production is required today to support the cost optimization of production; nevertheless, it is generally poorly integrated into the existing ERP and MES systems, even though the problem was described over ten years ago [36]. There is a need to understand why the take-up has been so slow and how ERP and MES systems can be better integrated to provide dynamic rescheduling and forecasting within the factory environment. Further research on how best to provide decision-making support to allow factories to deal with single-piece flows in a dynamic environment would be welcomed, given the improvements in the underlying technologies.

Managerial Implications
The work described in this paper is at an initial stage of development, yet it shows that, using advanced modelling tools, it is possible to make better production planning decisions. Single-piece flow brings with it a complexity that does not exist in more traditional batch-based production. The integration of mass customization creates a more complex problem in production for both planning and control. In this case, both have been integrated into an advisory tool to provide support for the planning process within a factory environment where different constraints exist. The approach was not designed to optimize both costs and delivery time and, in effect, provide a fully autonomous scheduling system, but rather to support the factory manager by providing scheduling options and allowing them to make the final decisions. This is similar to the way Google Maps operates.
The system enabled the integration of changes to system resources to be considered in the longer term (e.g., planned inspections or staff holidays). The planning tool also supported the understanding of the consequences of unplanned events on the overall system and the individual orders.

Conclusions
The presented use-cases demonstrate the successful application of a digital twin to a single-piece production line subject to high variance in cycle time. Using the instruments in place, a digital twin based on a DE simulation is implemented and synchronized with the PPS of the production line. While the PPS provides production data to the digital twin, the digital twin presents the information required by the operator to finish the queued jobs in time. Using this rather simple setup, a significant improvement in the lateness of individual jobs is achieved. The reason for this success is the combination of the lean tools already in place, providing robustness, with the superior intelligence of the digital twin, providing the required flexibility in the system. This paper further demonstrates that the concepts presented in state-of-the-art applications of digital twins in production systems can successfully be transferred to traditional small-scale shop floors and still provide significant leverage.