Optimizing a Multi-State Cold-Standby System with Multiple Vacations in the Repair and Loss of Units

A complex multi-state redundant system with preventive maintenance subject to multiple events is considered. The online unit can undergo several types of failure, both internal and those provoked by external shocks. Multiple degradation levels, both internal and external, are assumed. Degradation levels are observed by random inspections and, if they are major, the unit goes to a repair facility where preventive maintenance is carried out. This repair facility is composed of a single repairperson governed by a multiple vacation policy. This policy is set up according to the number of operational units. Two types of task can be performed by the repairperson: corrective repair and preventive maintenance. The times embedded in the system are phase-type distributed and the model is built by using Markovian Arrival Processes with marked arrivals. Multiple performance measures, besides the transient and stationary distributions, are worked out through matrix-analytic methods. This methodology enables us to express the main results and the global development in a matrix-algorithmic form. To optimize the model, costs and rewards are included. A numerical example shows the versatility of the model.


Introduction
Redundant systems and preventive maintenance are of fundamental importance in ensuring reliability, preventing system failures and reducing costs. These questions, therefore, are of considerable research interest.
The occurrence of total, unexpected system failure can provoke severe damage and major financial loss. To avoid such an outcome, various reliability-enhancing methods can be applied, chief among which are redundancy and preventive maintenance. In this respect, cold, hot and warm redundant standby and k-out-of-n systems have been proposed. Among researchers who have addressed these questions, Levitin et al. [1] considered an optimal standby element sequencing problem (SESP) for 1-out-of-N: G heterogeneous warm-standby systems, while Zhai et al. [2] constructed a multi-value decision diagram with which to analyse a demand-based warm standby system. In related papers, Cha et al. [3] considered preventive maintenance for items operating in a random environment subjected to a shock Poisson process, Levitin et al. [4] evaluated the probability of mission success given an arbitrary redundancy level, and Osaki et al. [5] analysed the behaviour of a two-unit standby redundant system. Preventive maintenance enhances system reliability and performance, reduces costs, for both repairable and non-repairable systems, and decreases the probability of sudden equipment failure. Various maintenance systems were studied by [6,7] who developed a new model for the hybrid preventive maintenance of systems with partially observable degradation. Levitin et al. (2021) [8] modelled the (time-consuming) procedure of task transfer, in an event transition-based reliability analysis of standby systems in which preventive replacements are performed according to a predetermined schedule. The aim of this approach is to optimise preventive replacement scheduling and hence to maximise

The Assumptions
The system satisfies the following assumptions.
Assumption 1. The internal performance time is PH distributed with representation $(\alpha, T)$, with order $m$ (number of internal stages). The internal failure probability depends on the state. The column vectors $T^0_r$ and $T^0_{nr}$ contain the probabilities of repairable and non-repairable failures, respectively.
Assumption 2. The internal performance of the online unit is multi-state, where the first $n_1$ states correspond to minor damage and the rest to major damage.
Assumption 3. The external events occur according to a PH-renewal process where the time between two consecutive shocks is PH distributed with representation $(\gamma, L)$, with order $t$.
Assumption 4. An external shock can provoke a total non-repairable failure of the online unit with a probability equal to $\omega^0$.
Assumption 5. After an external shock the internal performance state can undergo a modification. This modification between any two internal states occurs according to the transition probability matrix $W$. The column vectors $W^0_r$ and $W^0_{nr}$ contain the probabilities of repairable and non-repairable failures, respectively, provoked by an external shock.
Mathematics 2021, 9, 913
Assumption 6. The time between two consecutive random inspections is PH distributed with representation $(\eta, M)$, with order $\varepsilon$.
Assumption 7. The vacation time is PH distributed with representation $(v, V)$, with order $\upsilon$.
Assumption 8. The corrective repair time is PH distributed with representation $(\beta_1, S_1)$, with order $z_1$.
Assumption 9. The preventive maintenance time is PH distributed with representation $(\beta_2, S_2)$, with order $z_2$.
The behaviour of the system is shown in Figure 1 for inspection and repairable failure, Figure 2 for non-repairable failure, and Figure 3 for the vacation policy.
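The phase-type assumptions above can be made concrete with a small sketch. It assumes the usual discrete-time convention, in which a PH distribution with representation $(\alpha, T)$ has probability mass function $P\{X = n\} = \alpha T^{n-1} T^0$ for $n \geq 1$, with $T^0 = e - Te$, and in which $\alpha$ sums to one. The function names and the two-phase matrix `T_demo` are hypothetical and are not taken from the paper.

```python
from typing import List

Vector = List[float]
Matrix = List[List[float]]


def exit_vector(T: Matrix) -> Vector:
    """Return T^0 = e - T e: probability of leaving the transient phases."""
    return [1.0 - sum(row) for row in T]


def vec_mat(v: Vector, A: Matrix) -> Vector:
    """Row vector times matrix."""
    return [sum(v[i] * A[i][j] for i in range(len(v))) for j in range(len(A[0]))]


def ph_pmf(alpha: Vector, T: Matrix, n: int) -> float:
    """P{X = n} = alpha T^(n-1) T^0 for n >= 1, and 0 otherwise."""
    if n < 1:
        return 0.0
    row = list(alpha)
    for _ in range(n - 1):
        row = vec_mat(row, T)
    return sum(r * t for r, t in zip(row, exit_vector(T)))


# Hypothetical two-phase example (values are NOT from the paper).
alpha_demo = [1.0, 0.0]
T_demo = [[0.7, 0.2],
          [0.0, 0.6]]
```

With a single phase, `ph_pmf([1.0], [[p]], n)` reduces to the geometric mass function $p^{n-1}(1-p)$. In the model, the exit vector of $T$ would presumably be split into the repairable and non-repairable parts $T^0_r$ and $T^0_{nr}$; that split is not reproduced here.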

Modelling the System. The Markovian Arrival Process with Marked Arrivals
The system is governed by a vector Markov process in discrete time. In this section the state space is described and, to model the proposed complex system, the behaviour of the online unit and that of the repair facility are developed separately.

The State-Space
The state-space is composed of macro-states and it is denoted by $S = \{U^n, U^{n-1}, \ldots, U^1\}$, where $U^k$ contains the phases when there are $k$ units in the system. In turn, these macro-states are partitioned into macro-states $E^{k,x}_s$, where $E^{k,x}_s$ contains the phases when there are $k$ units in the system and $s$ of them are in the repair facility, and the superscript $x$ indicates whether the repairperson is on vacation ($v$) or not ($nv$).
Initially, the repairperson begins to operate the first time that he comes back from vacation and the system has at least $N = k - R + 1$ units in the repair facility. He remains working until $N - 1$ units are in the repair facility. At this moment the repairperson goes on vacation. In any case, the order of the units in the repair facility has to be saved in memory, and there are two types of repair: corrective repair and preventive maintenance. For this reason, the macro-state $E^{k,x}_s$ is composed of the first-level macro-states $E^{k,x}_{i_1,\ldots,i_s}$. These macro-states contain the phases when there are $k$ units in the system, with $s$ of them in the repair facility, and the type of repair is given by the ordered sequence $i_1, \ldots, i_s$. The value of $i_l$ is equal to 1 or 2 if the unit is in corrective repair or preventive maintenance, respectively.
When the number of units in the system is $R - 1$, the repairperson takes up his workplace immediately. The inspection time is restarted each time that a unit occupies the online place.
The phase (k, s; i, j, u, m, r) indicates that there are k units in the system, with s in the repair facility; the internal performance of the online unit is in state i, the external shock time is in state j, the cumulative damage caused by external shocks is given by u, m is the current phase of the inspection time and r is the corrective repair/preventive maintenance phase for the unit currently being attended to in the repair facility. If the repairperson is taking a vacation, the phase is indicated by v.
The order of these macro-states is as follows:

Modelling the Online Unit
The online unit can undergo different types of event at any time. These are denoted and defined as:
A: Internal repairable failure
B: Major revision
C: Non-repairable failure
O: No events
Two of them are described below, and the rest are given in Appendix A. Auxiliary matrices $U_1$ and $U_2$ are used in these descriptions.
Throughout this work the symbol $\otimes$ denotes the Kronecker product and, given a matrix $A$, $A^0$ denotes the column vector $A^0 = e - Ae$, $e$ being a column vector of ones of appropriate order.
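The two notational devices just introduced, the Kronecker product and the column vector $A^0 = e - Ae$, can be sketched in a few lines. This is only an illustration: the matrix `L` is read from the numerical example later in the paper, while `T_demo` and `M_demo` are hypothetical values chosen to keep the example small.

```python
from typing import List

Matrix = List[List[float]]


def kron(A: Matrix, B: Matrix) -> Matrix:
    """Kronecker product: block (i, j) of the result is A[i][j] * B."""
    return [[a * b for a in row_a for b in row_b]
            for row_a in A for row_b in B]


def exit_column(A: Matrix) -> List[float]:
    """A^0 = e - A e, returned as a flat list (one entry per row of A)."""
    return [1.0 - sum(row) for row in A]


# L is taken from the numerical example; T_demo and M_demo are hypothetical.
L = [[0.9, 0.05],
     [0.0, 0.5]]
T_demo = [[0.8, 0.1],
          [0.0, 0.7]]
M_demo = [[0.9]]

# One joint step with no shock and no inspection: T (x) L (x) M.
no_event_term = kron(kron(T_demo, L), M_demo)
```

Each row of `no_event_term` sums to less than one, because the remaining probability mass belongs to the other events (shock, inspection, failure).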

No Events at a Certain Time (O)
We assume that the online unit is operational and at this time it continues working. This occurs because of different situations:

• The internal performance continues in the same phase or changes to another, equally operational, state. There is no external shock ($T \otimes L$), and no inspection takes place ($M$). The matrix that governs this transition for the online unit is $T \otimes L \otimes M$.
• The online unit undergoes an external shock but total failure does not occur ($L^0\gamma(1-\omega^0)$). This external shock might modify the internal performance but does not produce internal failure ($TW$). No inspection takes place ($M$). The matrix is $TW \otimes L^0\gamma(1-\omega^0) \otimes M$.
• An inspection takes place and the time preceding the next one begins ($M^0\eta$). The inspector observes that the online unit does not need preventive maintenance and no external shock occurs ($U_1 T \otimes L$). The matrix is $U_1 T \otimes L \otimes M^0\eta$.
• An inspection takes place and the time preceding the next one begins ($M^0\eta$). One external shock also takes place without total failure ($L^0\gamma(1-\omega^0)$). This shock provokes a change in the internal performance without failure and the inspection observes minor damage ($U_1 TW$). The matrix is $U_1 TW \otimes L^0\gamma(1-\omega^0) \otimes M^0\eta$.
Therefore, the matrix that governs this transition for the online unit is given by the sum of the four matrices above.

Non-Repairable Failure (C)
The online unit is assumed to be operational and at the next time point a non-repairable failure occurs, because:
• An internal non-repairable failure occurs with no external shock, $T^0_{nr}\alpha \otimes L$.
• An external shock occurs, but does not provoke total failure. This shock provokes a non-repairable internal failure or, irrespective of the shock, the online unit may experience a non-repairable internal failure. The matrix is $T^0$ …
• An external shock provokes total failure. In this case the internal behaviour is irrelevant. The matrix is $e\alpha \otimes L^0\gamma\omega^0$.
This transition is independent of the inspection time. After the online unit experiences a non-repairable failure, the online place is occupied by an identical substitute unit. If only one unit is operational and online (i.e., all others are under repair), this unit experiences a non-repairable failure and no repair is completed, then no immediate substitution can be made and therefore the system does not restart.

The Markovian Arrival Process with Marked Arrivals (MMAP)
The behaviour of the system is governed by a MMAP. The representation of this MMAP is given from the types of event shown below:
A: Internal repairable failure
B: Major revision
C: Non-repairable failure (default without D)
D: The repairperson resumes work (default without A, B, C)
AD: Internal repairable failure and the repairperson resumes work
BD: Major revision and the repairperson resumes work
CD: Non-repairable failure and the repairperson resumes work
NS: New system
O: No events
The representation of the MMAP is $(D^O, D^A, D^B, D^C, D^D, D^{AD}, D^{BD}, D^{CD}, D^{NS})$. The transition probability matrix associated to the embedded Markov chain from the MMAP is given by $D = D^O + D^A + D^B + D^C + D^D + D^{AD} + D^{BD} + D^{CD} + D^{NS}$. Two matrices $D^Y$ are described in the next section. The rest are given in Appendix B.

The Matrices $D^A$ and $D^B$
The matrices $D^A$ and $D^B$ govern the transition when a repairable failure or a major inspection takes place, respectively. These matrices are composed of matrix blocks that contain the transitions between macro-states $U^k$. Each is a block-diagonal matrix, given that the number of units in the system does not change in this transition. The matrix $D^Y_k$ contains the transition probabilities when there are $k$ units in the system and the event $Y$ occurs, for $Y = A$ or $B$ and $k = 1, \ldots, n$. These blocks are composed of further blocks.
• If the number of units is less than $R-1$, the repairperson is always in his workplace. The block $D^{Y,k,nv}_{i,j}$ contains the transition, when there are $k$ units in the system, from $i$ units in the repair facility to $j$ (an event of type $Y$ occurs) with the repairperson in his workplace. For instance, the cases $D^{A,k,nv}_{01}$ and $D^{B,k,nv}_{01}$ (transition $E^{k,nv}_0 \to E^{k,nv}_1$ for types A and B, respectively) are analyzed.
In both cases, there are k units in the system and none of these is in the repair facility (all operational). The online unit goes to the repair facility if it undergoes an internal repairable failure (H A ) or a major inspection (H B ). In both cases a new unit will occupy the online place if the number of units in the system is greater than one. If the event is a repairable failure, then the unit will begin the repair given that the repairperson is not on vacation (β 1 ). If the event is a major inspection, the initial distribution for the preventive maintenance would be β 2 .

• If the number of units is greater than or equal to $R$, the repairperson can be on vacation or not. If the repairperson returns and there are fewer than $R$ operational units, then he remains at his workplace. The events A and B considered here are a repairable failure or a major inspection without the repairperson returning to work. For $k = R, \ldots, n$ (with $N = k - R + 1$, the limit of the number of units in the repair facility for the repairperson to remain), the matrix is partitioned into two large matrix blocks depending on the transition between macro-states: the repairperson continues on vacation, or continues in the repair facility.
The rest of matrices for this matrix block are as follows.

Measures
Multiple interesting measures in transient and stationary regime can be worked out and are described in this section.

The Transient and the Stationary Distribution
The transient distribution is determined by the initial distribution and the transition probability matrix of the vector Markov process given in Section 3.3.
Initially the online unit is new and the inspection time begins. Then, the initial distribution of the Markov process is $\phi = [\alpha \otimes \gamma_{st} \otimes \eta, 0]$, where $\gamma_{st}$ is the stationary distribution of the phase-type renewal process with transition probability matrix $L + L^0\gamma$.
The probability of occupying the macro-state $E^{k,a}_s$ at time $\nu$ is worked out by matrix blocks from the transient distribution $p^\nu = \phi D^\nu$, taking the entries in the range $I^{k,a}_s$ of the corresponding states.
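As an illustration of the vector $\gamma_{st}$ used in the initial distribution, the following sketch builds the matrix $L + L^0\gamma$ and approximates its stationary distribution by power iteration. Power iteration is a choice made here for brevity, not the method stated in the paper; the values of $\gamma$ and $L$ are those given in the numerical example, and the function names are hypothetical.

```python
from typing import List

Vector = List[float]
Matrix = List[List[float]]


def renewal_matrix(gamma: Vector, L: Matrix) -> Matrix:
    """Return L + L^0 gamma, with L^0 = e - L e."""
    l0 = [1.0 - sum(row) for row in L]
    n = len(L)
    return [[L[i][j] + l0[i] * gamma[j] for j in range(n)] for i in range(n)]


def stationary_by_iteration(P: Matrix, tol: float = 1e-14,
                            max_iter: int = 10_000) -> Vector:
    """Approximate the stationary vector of a stochastic matrix P."""
    n = len(P)
    pi = [1.0 / n] * n
    for _ in range(max_iter):
        nxt = [sum(pi[i] * P[i][j] for i in range(n)) for j in range(n)]
        if max(abs(a - b) for a, b in zip(nxt, pi)) < tol:
            return nxt
        pi = nxt
    return pi


# Values from the numerical example: gamma = (1, 0), L = [[0.9, 0.05], [0, 0.5]].
gamma = [1.0, 0.0]
L = [[0.9, 0.05],
     [0.0, 0.5]]
gamma_st = stationary_by_iteration(renewal_matrix(gamma, L))
```

For these values the balance equation gives $\gamma_{st} = (10/11, 1/11)$, which the iteration reproduces to numerical precision.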
To calculate the stationary distribution in a matrix-algorithmic form, the matrix $D$ is partitioned into blocks for the transitions between the macro-states $U^j$. The stationary distribution $\pi$ verifies the balance equations $\pi D = \pi$ and the normalization equation $\pi e = 1$. This vector is partitioned according to the macro-states $U^j$ ($j$ units in the system), so that $\pi = (\pi_n, \pi_{n-1}, \ldots, \pi_1)$ for the macro-states $U^n, \ldots, U^1$, respectively.
The solution of this matrix system is $\pi_j = \pi_1 R_j$, $j = 2, \ldots, n$, with $R_j = R_{j+1} G_{j+1,j} = G_{1,n} G_{n,n-1} \cdots G_{j+1,j}$ for $j = 2, \ldots, n-1$, and $R_n = G_{1,n}$. The stationary probability vector for the macro-state $U^1$ can be worked out from the normalization condition and one balance equation, where $*$ denotes the corresponding matrix without the first column.
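The idea of replacing one balance equation by the normalization condition can be sketched on an unpartitioned chain. The code below is not the block recursion with the matrices $G$ and $R_j$; it is a minimal illustration, under that simplification, of solving $\pi [\, e \mid (D - I)^* \,] = (1, 0, \ldots, 0)$, where $(D - I)^*$ is $D - I$ without its first column. The 3-state matrix `D_demo` is hypothetical.

```python
from typing import List

Vector = List[float]
Matrix = List[List[float]]


def solve(A: Matrix, b: Vector) -> Vector:
    """Solve A x = b by Gaussian elimination with partial pivoting."""
    n = len(A)
    M = [row[:] + [b[i]] for i, row in enumerate(A)]
    for c in range(n):
        p = max(range(c, n), key=lambda r: abs(M[r][c]))
        M[c], M[p] = M[p], M[c]
        for r in range(c + 1, n):
            f = M[r][c] / M[c][c]
            for k in range(c, n + 1):
                M[r][k] -= f * M[c][k]
    x = [0.0] * n
    for r in range(n - 1, -1, -1):
        x[r] = (M[r][n] - sum(M[r][k] * x[k] for k in range(r + 1, n))) / M[r][r]
    return x


def stationary(D: Matrix) -> Vector:
    """Solve pi [e | (D - I)*] = (1, 0, ..., 0) for a stochastic matrix D."""
    n = len(D)
    # Columns of A: e first, then columns 1..n-1 of D - I (first column dropped).
    A = [[1.0] + [D[i][j] - (1.0 if i == j else 0.0) for j in range(1, n)]
         for i in range(n)]
    # pi A = b is the same system as A^T pi^T = b^T.
    At = [[A[i][j] for i in range(n)] for j in range(n)]
    return solve(At, [1.0] + [0.0] * (n - 1))


# Hypothetical irreducible chain (values are NOT from the paper).
D_demo = [[0.5, 0.3, 0.2],
          [0.1, 0.8, 0.1],
          [0.3, 0.3, 0.4]]
pi_demo = stationary(D_demo)
```

Dropping one column of $D - I$ is legitimate because the balance equations of a stochastic matrix are linearly dependent, so one of them carries no extra information.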
From the stationary distribution and considering the macro-states, multiple proportional time measures can be defined:

• Proportional time that the system has $k$ units: $\pi_{U^k}$.
• Proportional time that the repairperson is in the workplace.
• Proportional time that the repairperson is on vacation.
• Proportional time that the repairperson is working.
• Proportional time that the repairperson is idle.

Availability and Mean Times
It is interesting to calculate the availability of the system, the mean time in each macro-state and the mean operational time. This has been summed up in Table 1 in both regimes, transient and stationary.

Expected Number of Events
The expected number of events up to time $\nu$ is determined using the Markovian Arrival Process with marked arrivals developed in Section 3.3. If the event considered is denoted by $Y$, the corresponding expected number of events is obtained from the matrix $D^Y$, for $Y$ = A, B, C, D, AD, BD, CD, NS. In the stationary regime this value is $\Lambda^Y = \pi D^Y e$. Other mean numbers of events, such as the mean number of non-repairable failures up to time $\nu$, can be calculated as follows.

Mean Number of Repairable Failures
For non-repairable failures, the value in the stationary case is $\Lambda^{nr} = \pi (D^C + D^{CD}) e$.

Mean Number of Times That the Repairperson Resumes to Work
Consider the mean number of times that the repairperson resumes work and remains in his workplace up to a certain time. In the stationary case this is $\Lambda^{rejoined} = \pi (D^D + D^{AD} + D^{BD} + D^{CD}) e$.

Mean Number of Times That the Repairperson Resumes and Begins a New Period of Vacation
The mean number of times that the repairperson returns and begins a new period of vacation up to a certain time is expressed through a matrix $Q$ described in Appendix C. In the stationary case it is $\Lambda^{r-b} = \pi Q e$.

Mean Number of New Systems
When the system is composed of only one unit and a non-repairable failure occurs, the system is restarted with $n$ new units. The mean number of new systems up to time $\nu$ is governed by the matrix $D^{NS}$. In the stationary case this measure is $\Lambda^{NS} = \pi D^{NS} e$.
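The stationary rates of the form $\Lambda^Y = \pi D^Y e$ in this section can be sketched on a toy discrete MMAP with a single marked event. Everything in the example is hypothetical: the two-state matrices `D_O` and `D_Y` are not from the paper, and the transient count $\sum_{u=0}^{\nu-1} \phi D^u D^Y e$ is the standard counting formula assumed here, since the transient expressions are not reproduced in the text.

```python
from typing import List

Vector = List[float]
Matrix = List[List[float]]


def vec_mat(v: Vector, A: Matrix) -> Vector:
    """Row vector times matrix."""
    return [sum(v[i] * A[i][j] for i in range(len(v))) for j in range(len(A[0]))]


def stationary(P: Matrix, tol: float = 1e-15, max_iter: int = 10_000) -> Vector:
    """Approximate the stationary vector of P by power iteration."""
    pi = [1.0 / len(P)] * len(P)
    for _ in range(max_iter):
        nxt = vec_mat(pi, P)
        if max(abs(a - b) for a, b in zip(nxt, pi)) < tol:
            return nxt
        pi = nxt
    return pi


def event_rate(pi: Vector, D_Y: Matrix) -> float:
    """Stationary mean number of Y events per unit of time: pi D^Y e."""
    return sum(pi[i] * sum(D_Y[i]) for i in range(len(pi)))


def expected_events(phi: Vector, D: Matrix, D_Y: Matrix, nu: int) -> float:
    """Assumed transient count: sum over u = 0..nu-1 of phi D^u D^Y e."""
    total, p = 0.0, list(phi)
    for _ in range(nu):
        total += event_rate(p, D_Y)
        p = vec_mat(p, D)
    return total


# Hypothetical toy MMAP: D = D_O + D_Y must be stochastic.
D_O = [[0.6, 0.2], [0.1, 0.5]]
D_Y = [[0.1, 0.1], [0.3, 0.1]]
D = [[D_O[i][j] + D_Y[i][j] for j in range(2)] for i in range(2)]
pi = stationary(D)
rate_Y = event_rate(pi, D_Y)
```

When the chain starts from $\pi$, the expected count up to time $\nu$ is exactly $\nu \Lambda^Y$, which is a convenient sanity check for an implementation.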

Rewards and Costs
To analyze the effectiveness of the model from an economic point of view, costs and rewards have been taken into account. A net profit vector associated to the state-space is built. First, several quantities are introduced:
B: gross profit per unit of time if the system is operational.
$c_0$: expected cost per unit of time, depending on the operational phase, while the system is operational.
$cr_1$: expected corrective repair cost per unit of time, depending on the repair phase.
$cr_2$: expected preventive maintenance cost per unit of time for a unit that was observed with major damage, depending on the preventive maintenance phase.
H: repairperson cost per unit of time while the repairperson is idle.
C: loss per unit of time while the system is not operational.
G: fixed cost associated to each return of the repairperson (independently of whether he stays or not).
fcr: fixed cost each time that the online unit undergoes a repairable failure.
fmi: fixed cost each time that the online unit undergoes a major inspection.
fnu: cost of a new unit (n·fnu is the cost of a new system).

Net Profit Vector
When the system occupies a determined state, a net profit value is produced. Costs and rewards from the online unit and the cost provoked by the repairperson have been taken into account to build the net profit vector.

Online Unit
If only the online unit is considered when the system visits the macro-state $E^{k,nv}_s$, a net reward for the phases of this macro-state is worked out. The net profit vector for the online unit is defined for the case in which the repairperson is in his workplace ($E^{k,nv}_s$), for $k = 1, \ldots, n$. From these vectors, the net profit vector corresponding to the online unit and the repair facility is built for the global state space.

Expected Net Profits and Total Net Profit
Net reward measures are worked out, in transient and stationary regimes, to analyze the effectiveness of the system from an economic point of view.

Expected Net Profit from the Online Unit Up to Time ν
The expected net profit up to time $\nu$ considering only the online unit is worked out from the net profit vector $nr$. In the stationary regime this is given by $\Phi_{w\_s} = \pi \cdot nr$.

Expected Cost from Corrective Repair and Preventive Maintenance
The expected costs due to corrective repair and preventive maintenance up to time $\nu$ are calculated from the vectors $mc_{cr}$ and $mc_{pm}$, respectively, where $mc_{cr}$ is the vector $nc$ with $cr_2 = 0_{z_2}$ and $mc_{pm}$ is the vector $nc$ with $cr_1 = 0_{z_1}$, $0_a$ being a column vector of zeros of order $a$. If the stationary regime is considered, then $\Phi_{cr\_s} = \pi \cdot mc_{cr}$ and $\Phi_{pm\_s} = \pi \cdot mc_{pm}$.

Total Net Profit
If costs, fixed costs and profits are considered, the total net profit up to time ν is In the stationary case this is

A Numerical Example
The system modelled in this paper can be applied to real-world engineering problems. It would be interesting to examine whether or not preventive maintenance is profitable and to determine the optimum distribution for vacation time and hence the corresponding value of R.

The System
We assume a standby system initially composed of four units, as described in this work. Each unit has four internal performance states, where the first two are considered minor damage and the last two major damage. The wear-out time is governed by the transition probability matrix $T$ of Assumption 1. The online unit is subject to external shocks. The time between two consecutive external shocks follows a phase-type distribution with representation $(\gamma, L)$, where $\gamma = (1, 0)$ and $L = \begin{pmatrix} 0.9 & 0.05 \\ 0 & 0.5 \end{pmatrix}$.
The mean time between two consecutive accidental external failures is equal to 11 units of time.
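The value of 11 units of time can be checked directly from the representation $(\gamma, L)$, reading the four entries of $L$ row by row as a $2 \times 2$ matrix. The sketch below uses the tail-sum identity $E[X] = \sum_{n \geq 0} P\{X > n\} = \sum_{n \geq 0} \gamma L^n e$, which is equivalent to $\gamma (I - L)^{-1} e$ but avoids a matrix inversion. The identity is a choice made for this illustration and the function name is hypothetical.

```python
from typing import List

Vector = List[float]
Matrix = List[List[float]]


def ph_mean(alpha: Vector, T: Matrix, tol: float = 1e-15,
            max_terms: int = 100_000) -> float:
    """Mean of a discrete PH(alpha, T): sum over n >= 0 of alpha T^n e."""
    row = list(alpha)
    mean = 0.0
    for _ in range(max_terms):
        term = sum(row)          # P{X > n} for the current n
        if term < tol:
            break
        mean += term
        row = [sum(row[i] * T[i][j] for i in range(len(row)))
               for j in range(len(T))]
    return mean


# Representation of the time between shocks given in the example.
gamma = [1.0, 0.0]
L = [[0.9, 0.05],
     [0.0, 0.5]]
mean_between_shocks = ph_mean(gamma, L)
```

By hand, $(I - L)^{-1} = \begin{pmatrix} 10 & 1 \\ 0 & 2 \end{pmatrix}$, so $\gamma (I - L)^{-1} e = 10 + 1 = 11$, in agreement with the stated mean.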
Each time that the system suffers an external shock, the internal performance can be modified, possibly producing a repairable or non-repairable failure; the changes among the operational states are governed by the matrix $W$. When a unit undergoes a repairable failure, or inspection observes major damage, it goes to the repair facility. Therefore, two types of task can be carried out by the repairperson: corrective repair and preventive maintenance. Both times are phase-type distributed. The mean corrective repair time is 7.3810 units of time and the mean preventive maintenance time is 2.5 units of time.

Costs and Rewards
Different costs and rewards have been considered, as described in Section 5. We assume a gross profit while the system is operational equal to B = 60. This is also the loss per unit of time while the system is not operational, C = 60. The online unit has a cost while it is operational depending on the operational phase. This vector is $c_0 = (5, 12, 30, 40)$. The repairperson can be on vacation or in his workplace. Each time that the repairperson returns from vacation, a cost equal to G = 20 is incurred. While the repairperson is idle, a cost equal to H = 15 per unit of time is incurred.
The online unit can undergo a repairable failure. In this case, the unit goes to the repair facility for corrective repair. A fixed cost equal to fcr = 10 is considered for each failure. Once in corrective repair, a cost depending on the state is given by $cr_1 = (18, 18, 18)$.
When inspection observes major damage, the unit also goes to the repair facility for preventive maintenance. A fixed cost fmi = 5 is incurred. Once in the repair facility, the cost depends on the preventive maintenance state. This is given by the vector $cr_2 = (15.5, 15.5, 15.5)$. Finally, when all units undergo a non-repairable failure the system is restarted. This has a cost per unit equal to fnu = 100.

Optimization Analysis
The repairperson can take a vacation, for a random duration, and inspections may take place at random intervals. This circumstance raises two interesting questions. Firstly, if a distribution class is assumed for the duration of the vacation, what are the optimum distribution and the optimum value of R (i.e., the limit value of the number of operational units needed to require the repairperson to remain in the facility on returning from vacation) from an economic standpoint? Secondly, is it profitable to perform preventive maintenance?
To answer these questions, we consider two classes of distributions, the geometric distribution and the Erlang distribution, from which optimum values for R and the other parameters can be determined.

The Geometric Distribution Case
We assume that the vacation time of the repairperson is geometrically distributed with parameter $p$. Then, the p.m.f. is $P\{X = n\} = p^{n-1}(1-p)$; $n = 1, 2, \ldots$
The stationary net profit depending on p for the system with and without preventive maintenance is shown in Figure 4. This has been worked out from Section 5.2. We can see that, when the geometric distribution is considered, the optimum value is reached for the preventive maintenance case with p = 0.8 and R = 3. In this case, and in the stationary case, the net profit per unit of time would be equal to 22.0571.

The Generalized Erlang Distribution Case
Analogously to the geometric case, we now assume that the vacation time follows a generalized Erlang distribution with shape parameter equal to 2. This distribution can be expressed as a phase-type distribution with representation $(v, V)$, where

$$v = (1, 0), \qquad V = \begin{pmatrix} p_1 & 1 - p_1 \\ 0 & p_2 \end{pmatrix}.$$

Figures 5 and 6 show the stationary net profit depending on the parameters $p_1$, $p_2$ and R for the case without preventive maintenance and with preventive maintenance, respectively.
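The phase-type representation can be checked numerically. The sketch below assumes the standard discrete phase-type formulas, P{X = n} = v V^{n−1} V0 with V0 = e − Ve, and mean v (I − V)^{−1} e; the helper names `ph_pmf` and `ph_mean` are illustrative and do not come from the paper.

```python
import numpy as np


def ph_pmf(v, V, n):
    """P{X = n} = v V^(n-1) V0 for a discrete phase-type (v, V), n >= 1."""
    V0 = 1.0 - V.sum(axis=1)   # one-step absorption probabilities
    return float(v @ np.linalg.matrix_power(V, n - 1) @ V0)


def ph_mean(v, V):
    """E[X] = v (I - V)^(-1) e."""
    m = V.shape[0]
    return float(v @ np.linalg.solve(np.eye(m) - V, np.ones(m)))


p1 = p2 = 0.67                 # parameter values discussed in the text
v = np.array([1.0, 0.0])
V = np.array([[p1, 1.0 - p1],
              [0.0, p2]])

print(ph_pmf(v, V, 2))         # (1 - p1) * (1 - p2): shortest possible vacation
print(ph_mean(v, V))           # 1 / (1 - p1) + 1 / (1 - p2)
```

With two phases in series the vacation lasts at least two time units, which is why the mass at n = 1 is zero.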

We can see that, when the generalized Erlang distribution is considered for the vacation time, the optimum value is reached for the preventive maintenance case with $p_1 = p_2 = 0.67$ and R = 3. In this case, the stationary net profit per unit of time is 22.4364.

The Optimum System with the Generalized Erlang Distribution
In the previous section we identified the optimum system. It is obtained when the generalized Erlang distribution is considered with parameters (2, 0.67, 0.67) and R = 3. In this section the performance measures of this system are analysed.
Firstly, the time up to the first replacement of the system (when all units have undergone a non-repairable failure), described in Section 4.3, has been analysed. The reliability function is plotted in Figure 7 for two cases, with and without inspection. From the corresponding phase-type distribution, the mean time up to a new system has been calculated in both cases: it is 167.7631 u.t. without inspection and 172.5269 u.t. with inspection.
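For a discrete phase-type time with representation (α, T), the reliability function is R(n) = α T^n e and the mean time is α (I − T)^{−1} e. The sketch below illustrates both on a hypothetical two-phase example; `alpha` and `T` are toy values chosen for illustration and are not the representation of the system studied here, so the figures above are not reproduced.

```python
import numpy as np


def ph_reliability(alpha, T, horizon):
    """Return [R(0), ..., R(horizon)] with R(n) = alpha T^n e."""
    e = np.ones(T.shape[0])
    state = np.asarray(alpha, dtype=float)
    values = []
    for _ in range(horizon + 1):
        values.append(float(state @ e))   # probability of not yet being absorbed
        state = state @ T                 # advance one time unit
    return values


# Toy representation (assumption, not taken from the paper).
alpha = np.array([1.0, 0.0])
T = np.array([[0.90, 0.08],
              [0.00, 0.95]])

reliability = ph_reliability(alpha, T, 50)
mean_time = float(alpha @ np.linalg.solve(np.eye(2) - T, np.ones(2)))
print(reliability[:3], mean_time)
```

The mean equals the sum of R(n) over n ≥ 0, which gives a simple cross-check on any computed reliability curve.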
Multiple measures, described in Section 4, have been obtained for this system with and without inspection. Table 2 shows the stationary distribution for the macro-states $U_k$ (k units in the system). These values can be interpreted as the proportion of time that the system spends in each macro-state. Performance measures are developed for the optimum system with and without inspection following Section 4. Table 3 shows the results.
Table 3. Performance measures for the optimum system (without inspection between parentheses).
The proportion of time that the repairperson is on vacation is 0.3194, which is of interest for the total cost. The repairperson is therefore at the workplace for a proportion 0.6806 of the time and working for a proportion 0.3139 of the time. Thus, the repairperson is working for 46.12% of the time spent at the workplace and is idle for the remainder.
Regarding the mean number of events per unit of time, we observe 0.0409 repairable failures, 0.0049 major inspections and 0.0058 new systems. Thus, for every 10,000 units of time, 58 new systems are expected to be re-started. The availability is also worked out: the system is operational 87.72% of the time, a 0.23% increase over the case without inspection. This gain is small, but the difference between the two net profits is substantial, up to 5.79% in favour of the case with preventive maintenance.
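The derived figures in the last two paragraphs follow from simple arithmetic on the reported values. The short sketch below reproduces them; the variable names are illustrative.

```python
# Reported stationary proportions for the optimum system with inspection.
vacation = 0.3194          # proportion of time on vacation
working = 0.3139           # proportion of time actually repairing

present = 1.0 - vacation               # proportion of time at the workplace
busy_share = working / present         # share of workplace time spent working
idle = present - working               # proportion of time idle at the workplace

# Reported mean numbers of events per unit of time.
rates = {"repairable failure": 0.0409,
         "major inspection": 0.0049,
         "new system": 0.0058}
horizon = 10_000
expected_events = {name: round(rate * horizon) for name, rate in rates.items()}

print(round(present, 4), round(100 * busy_share, 2), round(idle, 4))
print(expected_events)
```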

Conclusions
Matrix analysis methods can be used to model a complex discrete cold standby system subject to multiple events. This method facilitates the algorithmic and computational development of multi-state complex systems. In the case in question, the online unit within the system is subject to wear and external shocks and may undergo periodic or random inspection. The repair facility is composed of a single repairperson, who may take a vacation (absence) from the repair facility. This repairperson may perform corrective repair and/or preventive maintenance.
The system described is not the standard one in which units are replaced when they undergo a non-repairable failure. In the present study, the analysis takes account of the loss of units following the occurrence of a non-repairable failure. When such a failure occurs, the system continues working with one less unit. This outcome often occurs in practice, and is reflected in the study method presented.
The (indeterminate) number of units within the repair facility and the vacation policy applied determine the behaviour of the repairperson. The vacation time begins when the number of operational units exceeds a given value, and the repairperson will remain in place, without taking a vacation, if the number of operational units in the system is below a pre-determined value.
The system is modelled in an algorithmic and computational form by means of a Markovian Arrival Process with marked arrivals. Matrix-analytic methods are used to obtain the stationary distributions, and multiple measures are derived in matrix-algorithmic form. These measures are related to system performance and financial results.
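As an illustration of the stationary step, the sketch below solves πP = π, πe = 1 for a finite discrete-time Markov chain. It assumes that the matrices of the Markovian Arrival Process have already been added into a single stochastic matrix; the 3-state matrix `P` is a hypothetical toy example, not the model's matrix, and the block-structured algorithm of the paper is not reproduced.

```python
import numpy as np


def stationary_distribution(P):
    """Solve pi P = pi, pi e = 1 for an irreducible stochastic matrix P."""
    m = P.shape[0]
    # Stack the balance equations (P^T - I) pi = 0 with the normalisation row.
    A = np.vstack([P.T - np.eye(m), np.ones((1, m))])
    b = np.zeros(m + 1)
    b[-1] = 1.0
    pi, *_ = np.linalg.lstsq(A, b, rcond=None)
    return pi


# Toy transition matrix (assumption for illustration only).
P = np.array([[0.9, 0.1, 0.0],
              [0.5, 0.4, 0.1],
              [1.0, 0.0, 0.0]])

pi = stationary_distribution(P)
print(pi)
```

Stationary performance measures are then obtained as weighted sums of the entries of π, for example by adding the entries that belong to a given macro-state.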
The method presented in this paper enables us to analyse optimization problems in multi-state complex systems. A numerical example of such an optimization is presented. The results obtained show whether preventive maintenance is profitable and reveal the optimum number of operational units, hence determining the appropriate policy for the repairperson's vacation times.
Funding: This paper is partially supported by the project FQM-307 of the Government of Andalusia (Spain) and by the project MTM2017-88708-P of the Spanish Ministry of Science, Innovation and Universities (also supported by the European Regional Development Fund program, ERDF).

Informed Consent Statement: Not applicable.
Data Availability Statement: Not applicable.

Conflicts of Interest: The author declares no conflict of interest.

Appendix A. Transition Probability Matrix Blocks for the Online Unit Depending on Type of Event
Appendix B.1. Matrices for the Markovian Arrival Process Depending on the Type of Event
The matrices D A and D B are developed in the text. The rest are given below.

Appendix B.2. The Matrix D O
The matrix D O contains the transitions when no event occurs. This matrix is composed of blocks according to the transitions between the macro-states U k for k = 1, . . . , n. It is given by
Therefore, for the different macro-states, this is given by:

The Matrices D AD and D BD
The matrices D AD and D BD contain the transitions when the repairperson resumes work and, at the same time, a repairable failure or major inspection occurs. In this case, for Y = AD, BD we have
The matrix D C contains the transitions when only a non-repairable failure occurs. In this case the matrix is given, for r = 1, …, k−1 and for r = 2, …, k−1, by
The matrix D NS contains the transitions when a failure provokes the system to be restarted. Obviously, in this case the system is composed of only one unit. When this unit fails, a new system with n units re-starts. When this occurs, the vacation time begins again. The structure of the matrix is

Appendix C
To calculate the expected number of times that the repairperson returns to the workplace, regardless of whether he remains there or begins another period of vacation, the following matrix Q is defined. This matrix is built analogously to the matrix D, but any return is counted. Therefore, the matrix Q is the sum of the following matrices
The matrices D D, D AD, D BD, D CD are described in Appendix B. The other matrices have the same structure for the corresponding event given in Appendix B. These matrices are zero except for the following blocks.