Predictive Scheduling with Markov Chains and ARIMA Models

Production scheduling is attracting considerable scientific interest. Effective scheduling of production jobs is a critical element of smooth organization of the work in an enterprise and, therefore, a key issue in production. The investigations focus on improving job scheduling effectiveness and methodology. Due to simplifying assumptions, most of the current solutions are not fit for industrial applications. Disruptions are inherent elements of the production process and yet, for reasons of simplicity, they tend to be rarely considered in the current scheduling models. This work presents the framework of a predictive job scheduling technique for application in the job-shop environment under the machine failure constraint. The prediction methods implemented in our work examine the nature of the machine failure uncertainty factor. The first section of this paper presents robust scheduling of production processes and reviews current solutions in the field of technological machine failure analysis. Next, elements of the Markov processes theory and ARIMA (auto-regressive integrated moving average) models are introduced to describe the parameters of machine failures. The effectiveness of our solutions is verified against real production data. The data derived from the strategic machine failure prediction model, employed at the preliminary stage, serve to develop the robust schedules using selected dispatching rules. The key stage of the verification process concerns the simulation testing that allows us to assess the execution of the production schedules obtained from the proposed model.


Introduction
While conducting their activities, manufacturing enterprises establish a range of various goals. Certainly, one of the common strategic business objectives is to strengthen the market position. An enterprise that aims to broaden the group of clients, as well as foster the already existing business relations, must first and foremost be reliable and deliver quality goods within contractual deadlines [1,2]. Therefore, proper planning of works becomes central to sound execution of production processes. Production scheduling is the solution that can boost the capacity of manufacturers, hence there are numerous scientific publications in the field [3]. Researchers are still taking active efforts to optimise the effectiveness of production jobs scheduling in order to streamline the production planning process [2,4].
Unfortunately, most of the proposed solutions display numerous limitations [5]. It is common practice that the job scheduling algorithms build schedules for idealized production environments, i.e., assuming a static and stable production flow [5,6]. Thereby, a number of disruptive factors are excluded, which would bring the production to a halt in case they occur [7,8]. As a consequence, contractual deadlines would be missed, penalties would be imposed and the manufacturer's credibility would diminish. Among the various uncertainty factors, we can highlight the following [5,9]: • disruptions of resource availability (machine or robot failure) • disruptions of orders (placement of new orders) • disruptions of processes (material shortage, poor product quality) • disruptions associated with misestimation of the ongoing process parameters (incorrect estimation of operation times) • disruptions related to the change in the duration of the operation (employee absence or malaise, shorter or extended operation times) Scheduling production in real manufacturing systems cannot afford to pretend to be disruption-free. It is, therefore, of the essence that scheduling endeavors should consider production problems under uncertainty, which is capable of having a colossal effect on the timeliness of production [10,11]. Given that the more the process changes, the greater its disorganization, the scientific literature in the field of scheduling has recently turned towards robust scheduling [5,9].
The predictive scheduling method proposed in this work employs Markov chains and ARIMA (auto-regressive integrated moving average) models whose combination enables determining the values of the machine failure parameters (time to failure and repair time of the machine). In the next step time buffers are directly integrated into the scheduling process and determine the completion time of the production, which corresponds to the delivery date agreed with the customer.
Section 2 summarizes the essential information regarding robust production scheduling and reviews existing literature. The new methodology for scheduling under machine failure and failure prediction is described in Section 3, and the proposed solutions and results are discussed in the subsequent section. Conclusions and plans for further research work are presented in the last section of the work.

Essentials of Robust Scheduling
The purpose of a robust production schedule is predominantly to absorb potential disruptions, by allowing variability to the production system parameters.
Predictive scheduling-related to the planning stage.

2.
Reactive scheduling-related to the production stage.
A well-executed process of scheduling production jobs must pertain to the first of the phases (also referred to as the offline phase) when the available production data give the foundation for creating [3]: • a nominal schedule-based on the current system parameters, • a robust schedule-based on the assumption of uncertainty and variability of production.
Reasonable scheduling in this phase requires not only implementing appropriate tools but also suitable methods for determining uncertainty factors [6]. Unfortunately, there is a distinct paucity of solutions that consider the impact of process disruptions [5,6].

Existing Literature on Robust Production Scheduling
Due to the practical nature of the problem, robust production scheduling solutions are mainly developed for flow-shop and job-shop systems, which are the prevailing forms of organization in real production systems.
Although robust task scheduling in a flow-shop environment is rather neglected in the literature, a certain number of publications on this issue can be found [13][14][15]. Various approaches have been applied to building robust schedules in a flow-shop environment-from classical local search algorithms [16] to genetic algorithms [17], integer programming applications [18] and dynamic programming [19]. However, in an overwhelming majority, the publications are concerned with predictive-reactive scheduling, thus, tend to focus on the investigation of effective re-scheduling methods [19], and not on the analysis of uncertainty factors itself.
The second most-investigated scheduling problem is robust production scheduling in job-shop systems [20]. This system is a close reflection of a typical production environment, where the order of operations is imposed by the technological routes of their jobs. Researchers have long highlighted that the real job-shop problem requires a distinctly different approach than the shown by the prevailing theoretical tendencies [21], however, to date no clear trend has emerged. Robust scheduling solutions proposed in the aspects of job-shop production processes resemble the solutions for flow-shop systems inasmuch as they are mainly dedicated to the predictive-reactive approach. Standard approaches are shown to draw from various methods, such as genetic algorithms and their hybrids [20,22], immunological algorithms [23,24] and stochastic programming [25]. Other authors propose robust scheduling methods using expert systems [26].

Machine Failure as the Major Uncertainty Factor
Although many uncertainty factors can be named, the failure of technological machinery is still considered to be the central problem in manufacturing. This disturbance is regarded to have the greatest impact on performed processes. Failure will not only halt the production but its consequences will linger throughout the remaining production process [5,7].
From the analyzed scientific papers dealing with the topic of machine failure in scheduling, it can be seen that the search for methods that will enable approaching the problem of machine failure and predicting its occurrence are very much in place. Developing effective prediction methods is extremely important from the perspective of robust scheduling [18,19,26].
To this end, a probability distribution is among the most widely used approaches in the field of failure analysis. Researchers employ typical distributions and their combinations. The failure description proposed by Jensen [27] applies a uniform distribution. A similar solution is proposed by Al-Hinai and ElMekkawy [28], who, however, assume that the probability of failure is constant. In contrast, in their description of production process disturbances, Davenport et al. [29] implement a normal distribution, while Mehta and Uzsoy [21] utilize an exponential distribution. The authors propose the use of interesting approaches, such as the methods based on combinations of various distributions. The latter is used by Gürel et al. [4], who combine normal, triangular and exponential distributions.
Recently, researchers have also investigated the application of typical key performance indicators (KPIs) used in maintenance, e.g., MTTF (mean time to failure), MTBF (mean time between failures) and MTTR (mean time to repair). In their direct application of the indicators, Deepu [9] and Gao [5] analyze specially prepared scenarios that assume a certain frequency of machine failure, i.e., high, medium or low, to study the consequences of machine downtime and propose solutions to absorb the emerging disruptions to the schedule. With respect to the indirect use of the indicators, Kempa et al. [30,31] propose the use of the aforementioned reliability indicators indirectly for the purpose of estimating Weibull distribution parameters, while Rosmaini and Shahrul [32] in their study, couple the said indicators with statistical methods. These studies, however, suffer from the major drawback-the acquisition and use of the respective quantities is treated quite theoretically and lacks practical verification on real data of machine failure rates [9,31].
In addition to the methods referenced in the preceding paragraphs, a range of alternative failure prediction methods can be found in the literature. Jian et al. [20] propose accumulating failures to a single occurrence, describing it by means of the MTTR parameter and their original indicator, MBL (machine breakdown level). In turn, Rawat and Lad [33] determine failure rates from the analysis of machine load time distributions, and Baptista et al. in [34] use artificial neural networks for failure analysis.
Although constituting an interesting and important voice in the robust production scheduling studies, these models are associated with certain limitations. Their verification is often carried out on test data, which may not be the most accurate representation of actual problems in manufacturing systems. Secondly, the questions arise as to insufficient argumentation regarding the selection of the solutions. Consequently, the key aspect of implementing historical data in studies of the failure rate of machines is omitted.
The need to use real data on uncertainty factors is also emphasized by Davenport et al. [29] and Kalinowski et al. [35]. Only real knowledge on process disruptions can actually solve actual the problems that result from their occurrence. The issue was addressed in our previous work [36], where a model for the prediction of technological operation times in the framework of an intelligent job scheduling system was conceptualized. The study in question considered the impact of real processing time uncertainty on the production schedule and the developed intelligent module also implemented ARMA/ARIMA time series models, however, a problem of a different size was concerned and the verification was carried out for different production data. While such solutions can be found in the literature, the body of knowledge in the field still appears to lack proper depth [6].

Objectives
This paper formulates a predictive production scheduling process model in the job-shop environment under technological machine failure established with the help of Markov chains and ARIMA models. Our solutions predict the time of machine failure, as well as the time of repair, and constitute an alternative method to the models proposed in the literature. The objective function of our predictive production scheduling is to minimize the makespan, i.e., to produce a schedule with a minimum completion time of all jobs.

Basic Mathematical Notation of the Problem
Prior to formulating the problem of robust job scheduling under uncertainty in the job-shop system, we need to define the elements of the production process: • Set M is a set of m machines (workstations) processing jobs: • Set J is a set of n jobs (tasks) to process Processing job J i on machine M j constitutes an operation, which is called operation j of job i in the following. Therefore, it is necessary to define: • MO-a matrix of m columns and n rows describing the technology (the job order): where o ij -the order position of the operation j of job i, which is o ij = 0 when the job is not processed on the machine; and o ij = {1, . . . , m}, when it is.

•
Matrix PT-a matrix describing processing times of operations: where pt ij -the processing time of operations j of job i; for each o ij = 0, pt ij = 0.

of 19
• Set FT Ml of potential machine failure times: where f t Ml z -time to failure of the machine l; where z is a natural number representing the z-th machine failure.

•
Set TB Ml of time buffers to include in the nominal schedule (for machine l) to obtain a robust schedule: where tb Ml z -the size of the time buffer in the schedule at the failure time f t Ml z ; where for f t Ml z 0, tb Ml z 0.

Prediction of Failure and Machine Repair Times
In the paper, we analyze the system describing the shift on which a failure occurs. Additionally, the repair time [37] required for failure removal is analyzed. Let (Ω, F , P) be a probabilistic space: Ω-sample space (set of elementary events, outcomes), field F is a family of sample space Ω (set of all subsets of sample space Ω), P-probability measure (function that assigns each element from field F the probability, the value between 0 and 1), N-a set of natural numbers, R-a set of real numbers, S = {s 1 , s 2 , . . . , s k }-a set of possible shifts, k ∈ N, k < ∞-the number of possible shifts.

Definition 1.
A family {X t } t∈N of random variables X t : Ω → S for any t ∈ N is called a stochastic process with discrete time [38,39].
At any t ∈ N time, the system can take one of the possible states denoted as X t (ω) = x t ∈ S and P(X t (ω) = s i ) = p i (t) value means the probability that the system is in a state s i ∈ S,1 ≤ i ≤ k at a moment t ∈ N, and k i=1 p i (t) = 1.

Definition 2.
A stochastic process {X t } t∈N with a discrete time is called a Markov chain [38,39]. If for each n ∈ N , moments t 1 , t 2 , . . . , t n ∈ N satisfying the condition t 1 < t 2 < . . . < t n and any x 1 , x 2 , . . . , x n ∈ S, the equality: holds.
Below we assume t n = n ∈ N. If {X t } t∈N is a heterogeneous Markov chain, then for any t ∈ N and 1 ≤ i, j ≤ k, the value: is the transition probability from s i state at the moment t − 1 to s j state at moment t. From Markov property (7), the conditional probability distribution of the future process state depends only on the current state at moment t, regardless of the past. The matrix P(t) = p ij (t) 1≤i,j≤k is called the transition probabilities matrix at the moment t and the elements of the P(t) matrix satisfy the condition k j=1 p ij (t) = 1 for t ∈ N and 1 ≤ i ≤ k. Thus, if for a homogeneous Markov chain [39,40] the matrix P = p ij 1≤i,j≤k satisfies the condition k j=1 p ij = 1, 1 ≤ i ≤ k, then it is known as the one-step transition probability matrix. From the above, for a homogeneous Markov chain, the transition probability from s i state at t moment to the s j state at t + n moment is calculated as follows [38,40]: where p (n) ij 1≤i,j≤k = P n , n ∈ N is the transition probability matrix in n steps.

Definition 4.
If {X t } t∈N is a homogeneous Markov chain and there is a distribution π = (π 1 , π 2 , . . . , π k ) where π i ≥ 0, 1 ≤ i ≤ k and k i=1 π i = 1 satisfying the equation: then the distribution π is called the stationary distribution of the homogeneous Markov chain.
This property means that if at some n ∈ N moment the chain reaches a stationary distribution, then for each subsequent moment greater than n, the distribution will remain the same. To determine the stationary distribution, we solve Equation (10).
Let {x t } 0≤t≤n be the realization of Markov chain, where n i = #{t : x t = s i , 0 ≤ t ≤ n} is the number of moments for which the system was in s i state, 1 ≤ i ≤ k and k i=1 n i = n. The value n ij = # t : x t = s i , x t+1 = s j , 0 ≤ t ≤ n − 1 represents the number of transitions from the state s i to the state s j for 1 ≤ i, j ≤ k and k j=1 n ij = n i . We calculate the estimator of transition probability from s i state to s j state asp ij = n ij n i In this work, the goodness of fit test is used to verify Markov property χ 2 [40,41]. At the significance level α ∈ (0, 1), we create a working hypothesis: H 0 : P(X t = x|X t−1 = y, X t−2 = z) = P(X t = x|X t−1 = y) (the chain {X t } t∈N meets Markov property) and an alternative hypothesis: To verify the hypothesis H 0 , we calculate the test statistics: which has a χ 2 distribution with k 3 degrees of freedom and n ijv = # t : x t = s i , x t+1 = s j , x t+2 = s v , 0 ≤ t ≤ n − 2 is the number of transitions from state s i to state s j and next to state s v for 1 ≤ i, j, v ≤ k. The critical value is a quantile of order 1 − α for χ 2 distribution with k 3 degrees of freedom. We denote as χ 2 1 − α, k 3 . If χ 2 e < χ 2 1 − α, k 3 , then at the significance level α, there are no grounds for rejecting the working hypothesis H 0 . So, we can assume that the chain {X t } t∈N meets Markov property. When χ 2 e ≥ χ 2 1 − α, k 3 , then at the significance level α we reject the working hypothesis H 0 in favor of the alternative hypothesis. Thus, the chain {X t } t∈N does not meet Markov property.
An ARIMA model, which usually correlates historical values in a time series, is applied to forecast the repair time. The behavior of the considered time series can be predicted (i.e., forecast with appropriate probability) based on current observation and historical data (dataset). Let {rt t } t∈N denote the sequence of times needed to repair a plant. Because the times needed do remove the failures can take only positive values, the variance-stabilizing transformation can be applied. The series {ε t } t∈N is identified using ARIMA(p, r, q) models, p, r, q ∈ N (auto-regressive integrated moving average) [42][43][44][45]. In this paper, the logarithm of repair time is modelled as follows: where { t } t∈N is a sequence of independent random variables with distribution N 0, σ 2 . To estimate the integration degree, Augmented Dickey-Fuller (ADF) and the Kwiatkowski-Phillips-Schmidt-Shin (KPSS) [42,46,47] tests are applied. The Markov chains and ARIMA models are implemented to determine the values of elements of sets FT Ml and TB Ml . The analysis of historical machine failure data leads to determining important failure parameters, which can subsequently help establish buffer time periods in the predictive production scheduling method.

Historical Data
In order to verify the solutions proposed in this publication, a process of robust job scheduling is performed on the production data and on historical data of technological machine failure. The production data describes the execution of 9 production jobs processed by 12 machines, constituting a manufacturing cell ( Table 1). The parts are produced in batches of 50 elements (b = 50) and the setup times of individual operations are not taken into account in the production scheduling process (uncertainty of setup times is a different factor and requires additional research). Therefore: where b-quantity of elements in the production batch, to ij -operation time. The data describing the failure rate of machines in the scheduled production process have been obtained from the computer records of machine operation from a maintenance department ( Table 2). The collected data describe the failure rate of six technological machines that are crucial for the performed production process. The numbers of observations are as follows: The nominal and robust schedules are subsequently built based on the data presented above. This, however, must be preceded by the prediction of failure parameters with the application of Markov chain and ARIMA models, which is described in the next section.

Prediction of Machine Failure Parameters
The failure prediction process is performed using appropriate scripts formulated in the RStudio computing environment [48] that enable the analysis in the range described in Section 3.3.
Modelling the machine failure rates using the Markov chain (containing information on changes in production) is performed with the use of the markovchain library. Initially, the collected empirical data is verified to check whether they fulfil the properties of the Markov process. The analyses confirm that the data from the analyzed machines meet the required properties, which is further evidenced by the p-value index ( Table 3). The p-value is the probability of obtaining hypothesis test results as extreme as the observed results, assuming that the null hypothesis is correct (data chain has a Markov property). In the next stage of the machine failure prediction analysis, the transition rate matrices are generated from the collected data. From the obtained information, we determine, at a given probability level, the occurrence of subsequent chain elements, which in this study is the probability of machine failure occurrence at subsequent production shifts (Table A1).
For clarity of presentation, the results from calculations are given in the form of transition diagrams (Figure 1), which additionally enable determining the probability of failure not only on subsequent shifts but also during the current shift (the arrow returning to the node). In Figure 1, knots represent shifts and the probability of machine failure during a given shift is given at the beginning of each arrow (next to the knot), e.g., for machine M 6 , the probability that the machine failure will occur after shift 1 during shift 2 is 0.593. Appl. Sci. 2020, 10, x FOR PEER REVIEW 9 of 19 The second key parameter of failure is its time. To this end, the forecast package is used, which enables identifying the elements of ARIMA time series. As a result, the predicted machine failure repair times are determined. Before forecasting, each machine is tested to verify whether the process is stationary, also the component models are established (autoregression, moving average and the integration). In the subsequent step, future repair times are predicted. Due to the fact that in ARIMA models, the forecasts may also take negative values, the collected observations have been first subjected to variance-stabilizing transformation, and after the prediction, the process is completed and their original sets of values are returned. The results from the model identifying exemplary predicted times for the 5 subsequent steps of the time series are presented in Table 4. The second key parameter of failure is its time. To this end, the forecast package is used, which enables identifying the elements of ARIMA time series. As a result, the predicted machine failure repair times are determined. Before forecasting, each machine is tested to verify whether the process is stationary, also the component models are established (autoregression, moving average and the integration). In the subsequent step, future repair times are predicted. Due to the fact that in ARIMA models, the forecasts may also take negative values, the collected observations have been first subjected to variance-stabilizing transformation, and after the prediction, the process is completed and their original sets of values are returned. The results from the model identifying exemplary predicted times for the 5 subsequent steps of the time series are presented in Table 4. The data below display the disparity between individual machine repair times. Each machine is coupled with a different ARIMA model. From the set of possible ARIMA models, we select a model that has a smaller AIC (Akaike Information Criterion) value. The analytical process is carried out with the exclusive use of autoregression (Machine M 1 ), or the moving average (Machine M 8 ), or a combination thereof (Machine M 2 , M 6 , M 7 ). In a single case (Machine M 2 ), time variability is a series of independent random variables; therefore, the forecast repair times are established on the basis of mean observations.
The results of the prediction of machine failure parameters are perfectly applicable to robust production scheduling. Expressing the time of failure through production shifts allows us to determine the intervals in which machine failures are likely to occur. In turn, the forecasted repair times could be further employed to the determination of machine inspection and maintenance times. Therefore, in the next stage, the obtained analysis results are used to formulate a robust production schedule, whose effectiveness is subsequently verified.

Production Process Modelling and Scheduling
Before the obtained prediction results could be subjected to further processing, nominal schedules are generated from the real data. Let us assume that the product is manufactured in batches of 50 pieces and the objective function of the schedule is to minimize the makespan (C max ).
The schedules are developed using LiSA (A Library of Scheduling Algorithms) software [49], for the analysis of scheduling problems in various environments. The production data serve to represent the production system: a set of machines M and jobs J, the technology matrix MO and the matrix of processing times PT. To test the alternative versions of scheduling, the choice of the next operation is determined by two dispatching rules [50]: • LPT (longest processing time) • SPT (shortest processing time) The robustness of the production schedules is to be provided by the inclusion of the results from the predictions of failure parameters using the data describing the states of production shifts set S and predicted repair times rt t . As a result, we have managed to determine the elements from the set of predicted machine failure times FT Ml and the service time buffer set TB Ml . As noted in the introduction, the data obtained for strategic machines are analyzed from the perspective of executed production processes, hence the discrepancies in the designations in the technology records and the schedule. All the data that serve to generate the robust schedule are presented in Table 5. Since it is built on the data above, the obtained schedule is robust to machine failure disturbances. The procedure for generating the robust schedule is rather straightforward: service time buffers TB Ml are implemented into the nominal schedule in the slots indicated by the set of machine failure times FT Ml . The time-to-failure is counted only for the machines processing jobs (idle time was disregarded). In the case when a service time buffer is required during the operation, any interfering operation was shifted right in the order of jobs.

Evaluation Criteria
To verify the effectiveness of the robust scheduling solutions, as well as for the sake of comparative analysis against the nominal schedules, the following assessment criteria are applied: • makespan C max -total production time, • mean completion time C given by: where C i -the completion time of job i. • mean flow time F given by: where F i -the flow time of job i. • the number of critical operations Y K is derived from: where Y K -the number of critical operations, tz ij -the completion time of operation o ij (current), tr ij+1 -the start time of operation o ij + 1 (subsequent).
The verification of the obtained schedules is performed during the online stage (production execution), modelled with the Enterprise Dynamics software in a series of simulation tests. The computations serve to determine total completion times of production jobs under strategic machinery failure. The modelling tool used in the study allows detecting machine failure times by setting the MTTF and MTTR indicators and selected probability distributions ( Table 6). The MTTF values are specified for the uniform probability distribution (i.e., the machine failure can occur at any time-from the start of the job on the machine until its completion). On the other hand, the MTTR values are determined using the Weibull distribution, obtained for machine repair times from the Cullen-Frey graph [6]. Twenty-five simulations of the production process are performed for each of the LPT or SPT schedules. The indicators employed in the assessment of the results from simulations are: • Increase of completion time of all jobs ∆C max given by: where ∆C max -increase of completion time of all jobs, C max -nominal schedule makespan, C max -actual (executed) schedule makespan.

•
Relative increase of makespan E C max given by: where E C max -relative increase of makespan, C max -nominal schedule makespan, C max -actual (executed) schedule makespan.

Experimental Results
The first of the verification objectives is to compare the nominal and robust production schedules in terms of evaluation criteria. The values of the evaluation indicators of the schedules have been determined and are summarized below (Table 7). From the presented data, it can be seen that the implementation of service time buffers increases the completion times of all jobs. As a consequence, in each of the analyzed cases, one additional shift is required to complete the production process. This effect is not at all unexpected, given that incorporating service time buffers is inseparably connected with elongation. It should be noted, however, that in the robust schedule the time spent in the production system is not extended owing to the fact that the mean flow time is subject to slight elongation. A further indicator of robustness that is of great importance in scheduling is the number of critical operations. Scheduling should minimize its value because the stability of the executed process is compromised with the rising number of critical operations. In the analyzed example, the number of critical operations is considered in relation to individual jobs ( ) and machines ( ). The robust scheduling results with respect to the number of critical operations are presented in Table 8. The incorporation of service time buffers is shown to have a positive effect on the considered parameters. The number of critical operations is reduced by up to 20%. This confirms the legitimacy of implementing service time buffers, which generate additional space in the schedule and thus can prove to be beneficial in the event of machinery failure or other process disruptions. A further indicator of robustness that is of great importance in scheduling is the number of critical operations. Scheduling should minimize its value because the stability of the executed process is compromised with the rising number of critical operations. In the analyzed example, the number of critical operations is considered in relation to individual jobs (Y KJ ) and machines (Y KM ). The robust scheduling results with respect to the number of critical operations are presented in Table 8. The incorporation of service time buffers is shown to have a positive effect on the considered parameters. The number of critical operations is reduced by up to 20%. This confirms the legitimacy of implementing service time buffers, which generate additional space in the schedule and thus can prove to be beneficial in the event of machinery failure or other process disruptions.
Simulation tests are conducted to indicate which of the schedules features a makespan closer to the simulated completion time of all jobs. The tests follow the procedure presented in the preceding section and their results are given below, in Tables 9 and 10. The schedules generated with the LPT and SPT dispatching rules are shown to outperform the nominal schedule. Their accuracy of predictions is closer to the production data established in simulations. At a closer investigation, the LPT schedules (Table 9) exhibit good compliance of robust and simulated makespans. The schedule is robust for an average of 1.56 h longer than the simulated process; however, considering the nominal schedule, the completion time of all jobs is on average 4.65 h shorter. The comparable makespan length of the robust schedule and the executed production schedule is further confirmed by the mean value of indicator E C max , which amounts to 1.03 for the robust schedule, and 0.91 for the nominal schedule.
A similarity of a comparable magnitude is also shown to occur in the production process simulations conducted according to the SPT schedules (Table 10). The mean makespan of the nominal schedule is −5.75 h, while of the robust schedule 2.67 h. In the same case, the mean relative increase is 0.89 for the nominal schedule and 1.05 for the robust schedule.
Nevertheless, it should be noted that in several simulations (for both the LPT and SPT rules), the nominal schedules display a closer resemblance to the executed production process; still, the robust and the executed schedules also show a good fit (e.g., simulation 2 for the LPT rule, or simulation 18 for the SPT rule).
To summarize, the data obtained in the study clearly indicate that the schedule with service time buffers achieves a closer resemblance to the simulated makespan. The schedules generated with the LPT and SPT dispatching rules are shown to outperform the nominal schedule. Their accuracy of predictions is closer to the production data established in simulations. At a closer investigation, the LPT schedules (Table 9) exhibit good compliance of robust and simulated makespans. The schedule is robust for an average of 1.56 h longer than the simulated process; however, considering the nominal schedule, the completion time of all jobs is on average 4.65 h shorter. The comparable makespan length of the robust schedule and the executed production schedule is further confirmed by the mean value of indicator , which amounts to 1.03 for the robust schedule, and 0.91 for the nominal schedule.
A similarity of a comparable magnitude is also shown to occur in the production process simulations conducted according to the SPT schedules ( Table 10). The mean makespan of the nominal schedule is −5.75 h, while of the robust schedule 2.67 h. In the same case, the mean relative increase is 0.89 for the nominal schedule and 1.05 for the robust schedule.
Nevertheless, it should be noted that in several simulations (for both the LPT and SPT rules), the nominal schedules display a closer resemblance to the executed production process; still, the robust and the executed schedules also show a good fit (e.g., simulation 2 for the LPT rule, or simulation 18 for the SPT rule).
To summarize, the data obtained in the study clearly indicate that the schedule with service time buffers achieves a closer resemblance to the simulated makespan. Figures 3 and 4 display the results for makespan increase indicators, which provide further evidence confirming the legitimacy of our solutions. The proximity of the robust schedules to the simulated schedules is again highlighted by their being situated close to the dashed line.   Robust production schedules generated with the application of our solutions determine makespans closer to the simulated completion times of all jobs in the simulated production conditions under technological machinery failure uncertainty. This is evidenced by several indications, e.g., the fact that for the robust schedules, the values of ∆ max tend to be close to 0 (Figure 3), whereas in the case of around 1 (Figure 4). Values 0 and 1 of the considered indicators denote igh compliance of the robust schedule with the production execution (simulation).

Summary and Conclusions
The execution of production processes is associated with the occurrence of various uncertainty factors. Disruptions generate problems that may have a marked effect on production schedules. Therefore, more effort is required in developing techniques and methods that affirm the relevance of uncertainty factors in manufacturing and propose viable solutions. Robust scheduling exhibits the required potential to cope with disruptions and, thus, should be studied further.
In this investigation, the aim was to design a robust production scheduling method with the implementation of Markov chain theory and ARIMA models that will provide for the negative effects of technological machinery failure. The analyses reveal that the inclusion of machine failure in the production schedule results in the extension of the performance indicators, mean flow time, mean job completion time, as well as the central criterion describing the performance of the production system-the completion time of all jobs (makespan). However, the elongation remains within the reasonable limits given that the production is carried out according to failure-inclusive schedules. The simulations evidence that the robust schedules bear a closer similarity to the simulated production process than their nominal equivalents. In other words, the proposed model generates high-accuracy makespan while increasing the robustness and stability of the schedule.
To extend our research in the future, we intend to develop improved models that will: provide for the management of other uncertainty factors in production scheduling (e.g., disruptions related to transport, availability of materials or employee absence), enable reactive scheduling of production jobs or extend the versatility of the proposed solutions over other manufacturing systems. Our current findings and methodologies should make a noteworthy contribution to the theory of production scheduling, as well as appeal to practitioners representing various manufacturing industries and different-sized enterprises.  Robust production schedules generated with the application of our solutions determine makespans closer to the simulated completion times of all jobs in the simulated production conditions under technological machinery failure uncertainty. This is evidenced by several indications, e.g., the fact that for the robust schedules, the values of ∆C max tend to be close to 0 (Figure 3), whereas in the case of E C max around 1 (Figure 4). Values 0 and 1 of the considered indicators denote igh compliance of the robust schedule with the production execution (simulation).

Summary and Conclusions
The execution of production processes is associated with the occurrence of various uncertainty factors. Disruptions generate problems that may have a marked effect on production schedules. Therefore, more effort is required in developing techniques and methods that affirm the relevance of uncertainty factors in manufacturing and propose viable solutions. Robust scheduling exhibits the required potential to cope with disruptions and, thus, should be studied further.
In this investigation, the aim was to design a robust production scheduling method with the implementation of Markov chain theory and ARIMA models that will provide for the negative effects of technological machinery failure. The analyses reveal that the inclusion of machine failure in the production schedule results in the extension of the performance indicators, mean flow time, mean job completion time, as well as the central criterion describing the performance of the production system-the completion time of all jobs (makespan). However, the elongation remains within the reasonable limits given that the production is carried out according to failure-inclusive schedules. The simulations evidence that the robust schedules bear a closer similarity to the simulated production process than their nominal equivalents. In other words, the proposed model generates high-accuracy makespan while increasing the robustness and stability of the schedule.
To extend our research in the future, we intend to develop improved models that will: provide for the management of other uncertainty factors in production scheduling (e.g., disruptions related to transport, availability of materials or employee absence), enable reactive scheduling of production jobs or extend the versatility of the proposed solutions over other manufacturing systems. Our current findings and methodologies should make a noteworthy contribution to the theory of production scheduling, as well as appeal to practitioners representing various manufacturing industries and different-sized enterprises.

Conflicts of Interest:
The authors declare no conflict of interest.

Appendix A
In the table below, the rows and columns describe individual production shifts. The probability of the machine failure during a given shift is derived from the matrix by setting the shift number in a given row against an appropriate column, e.g., for Machine M 6 , the probability that the machine failure will occur after shift 1 during shift 2 is 0.593 (first row, second column). The procedure of machine failure probability calculation is described in detail in Section 3.3 (Definition 4).