Article

Reliability Simulation of Two Component Warm-Standby System with Repair, Switching, and Back-Switching Failures under Three Aging Assumptions

1
Australian Maritime College, University of Tasmania, 1 Maritime Way, Launceston, TAS 7250, Australia
2
Nikola Vaptsarov Naval Academy—Varna, 73 V. Drumev Str., 9002 Varna, Bulgaria
*
Author to whom correspondence should be addressed.
Mathematics 2021, 9(20), 2547; https://doi.org/10.3390/math9202547
Submission received: 23 August 2021 / Revised: 1 October 2021 / Accepted: 6 October 2021 / Published: 11 October 2021
(This article belongs to the Special Issue Mathematical Modeling and Simulation in Mechanics and Dynamic Systems)

Abstract

We analyze the influence of repair on a two-component warm-standby system with switching and back-switching failures. The repair of the primary component follows a minimal process, i.e., it experiences full aging during the repair. The backup component operates only while the primary component is being repaired, but it can also fail in standby, in which case there will be no repair for the backup component (as there is no indication of the failure). Four types of system failures are investigated: both components fail to operate in a different order or one of two types of switching failures occurs. The reliability behavior of the system is investigated under three different aging assumptions for the backup component during warm-standby: full aging, no aging, and partial aging. Four failure and repair distributions determine the reliability behavior of the system. We analyzed two cases: in the First Case, we utilized constant failure rate distributions; in the Second Case, we applied the more realistic time-dependent failure rates. We used three methods to identify the reliability characteristics of the system: analytical, numerical, and simulational. The analytical approach is limited and only viable for constant failure rate distributions, i.e., the First Case. The numerical method integrates simultaneous differential algebraic equations. It produces a solution in the First Case under any type of aging, and in the Second Case but only under the assumption of full aging in warm-standby. On the other hand, the developed simulation algorithms produce solutions for any set of distributions (i.e., the First Case and the Second Case) under any of the three aging assumptions for the backup component in standby. The simulation solution is quantitatively verified by comparison with the other two methods, and qualitatively verified by comparing the solutions under the three aging assumptions. 
It is numerically proven that the full aging and no aging solutions could serve as bounds of the partial aging case even when the precise mechanism of partial aging is unknown.

1. Introduction

Assessing the reliability of a system is a key engineering task that has economic and safety implications. Having a better understanding of failure/repair rates of system components is a tool to design highly reliable systems and conduct repair operations at adequate cost levels while complying with adequate and reasonable maintenance schedules. A common approach for improving the reliability is to provide redundancy for excessively failing components. The redundant components may operate simultaneously in a sense that the system will never fail if at least one of the parallel components operates [1]. Another possibility is to design a “k out of n” configuration where all n components are in operation and the system is not failing if at least k of them operate properly [2]. However, the standby arrangement is the simplest, cheapest and the most utilized one; the system operates with some of its components (called primary) whereas the redundant components (called backup components) are in standby, but when a primary component fails, they take its place [3]. In this paper, we will focus on a two-component system with a standby arrangement where the backup components may fail either while in standby, or during operation after some imperfect switching mechanism has put those online. The switching mechanism can be continuous type when it actively monitors the primary component and makes its own decisions, but it can malfunction at any time [4]. However, in this paper, we will treat exclusively the widespread mechanisms that can fail only on demand when the switching is needed [5]. According to the failure intensities of the backup component, such systems are classified as cold-standby, warm-standby, and hot-standby [6]. In the hot-standby system, the intensity of failures of the backup component is the same during standby and operation, whereas in the cold-standby, there is no failure in standby. 
We will concentrate on the two-component warm-standby systems where there are failures of the backup component in standby, but with smaller intensity compared to its operational mode.
Additionally, the reliability of a system with backup components depends on the way of aging of the backup components while in standby. Previous works have identified three types of aging of the backup component during warm-standby: full aging, no aging and partial aging [7]. The full aging assumption means that the component changes its failure/repair rate during standby as if it is operational. Under the no aging assumption, the component does not change its failure/repair rate during standby. The partial aging assumption models the intermediate situation where the backup component experiences some wear during standby, but at a slower rate than if operational.
If some components of the system are deemed repairable, the system can be brought to its full operational capacity by replacing parts or by making adjustments [8]. In most works, the focus is on single-component repairable systems under various repair activities. A detailed discussion on how such tasks can be approached with modern statistical tools is offered in [9]. There are different types of repairs that can be adopted depending on objectives. The first possibility is the so-called perfect repair (a.k.a. as-good-as-new (AGAN) repair), where the primary component fails and it is replaced or restored to its original or good-as-new condition [10]. Minimal repair restores the device to the condition it was in immediately before the failure [11] (pp. 226–227). There may also be intermediate types of repairs (e.g., the partial perfect repair procedures mentioned in [8]). In the current work, the focus is on the case of minimal repair of the special type worse-than-old (WTO) [12]. The assumption is that during this repair, the non-repaired elements of the primary unit age as if the latter was operational.
In this paper, we will investigate the effects that adding repair and back-switching failures to a two-component warm-standby system with switching mechanism has on the reliability of the system. Our goal is to analyze how this affects the system reliability under different aging assumptions in standby. In such a system, the primary component begins operation, and when it fails, the system will try to activate the backup component, but a switching failure is possible. When the backup component is operational, the primary component undergoes minimal repair. If the latter finishes before the backup component fails, the system will try to activate back the primary component, but again a back-switching failure is possible. However, it is possible that the backup component will fail in standby. In that case, there will be no repair for the backup component since there will be no indication of the failure. The system is considered to have failed when either both components fail to operate at any given time, or when, after primary component failure, the switching mechanism fails on demand to switch the system operation to the secondary component (switching failure), or when, after a successful repair, the switching mechanism fails on demand to switch the system operation back to the primary component (back-switching failure). The primary component undergoes minimal repair, i.e., the primary component experiences full aging during the repair. The reliability behavior of the system will be investigated under three different aging assumptions for the backup component during standby: full aging, no aging and partial aging. Only mechanical aging of the components will be considered, which excludes any influence by some software aging (for discussion of the latter topic see [13,14]).
The focus of our investigation would be a two-component warm-standby system with repair, switching, and back-switching failures, denoted as 2SBRSBF. Our study concentrates on the characteristics of the uptime of the 2SBRSBF. We only consider repairs of the failed primary component of a working system. The repair of the failed system, which relates to downtime, is an important component of system availability, but it is outside of the scope of our paper (for elaborate case study of data center availability using Markovian modeling, see [14]). In [15], the causes of system and component failures were classified as technological failures, natural disaster failures, and man-made disasters (e.g., terrorism). In our study, we will consider only the technological failure of 2SBRSBF since the other two types tend to cause dependent component failures, which is outside of the scope of our study. Here, the standby mode of the 2SBRSBF is defined as a situation, where the primary component is working properly, and the backup component is fully operational, but its failure will not affect the normal operation of the system at this moment (in [16] such component configuration is classified as “active/cold-standby”).
The reliability behavior of the 2SBRSBF depends on four distributions: the failure and the repair distributions of the primary component, the failure distributions of the backup component in operation and in standby. We will analyze two cases for those distributions. In the First Case, all distributions will be with constant failure/repair rates. In the Second Case, the more realistic time-dependent failure/repair rates will be applied.
We will use three methods to identify the reliability characteristics of the 2SBRSBF: analytical, numerical, and simulational. The analytical approach is applicable for the First Case distributions. We will develop novel analytical solutions for the state probability functions in the case of exponential distributions. The numerical method creates and integrates simultaneous Ordinary Differential Equations (ODEs) for 2SBRSBF. This method is applicable for any set of First Case distributions and for Second Case distributions under the assumption of full aging in standby. However, there are no simultaneous ODEs that describe the behavior of 2SBRSBF with time-dependent distributions (i.e., Second Case) under no aging or partial aging assumptions in standby. To facilitate the simulational solution, we will introduce a novel method to generate failure times of the backup component in standby under the assumptions of full aging, no aging, or partial aging. Using this method, we will modify and generalize the algorithm from [17] to simulate the behavior of 2SBRSBF and to calculate its most important reliability characteristics. That algorithm will produce a novel simulation solution for any set of distributions (i.e., the First Case and the Second Case) under any of the three aging assumptions for the backup component in standby. The proposed algorithm will be validated quantitatively by comparing with the analytical and with the numerical solutions (if those exist) as well as qualitatively by comparing with the full aging results.
In what follows, Section 2 summarizes the state-of-the-art in the field and outlines the contributions of our paper. Section 3 will set up the problem for reliability characteristics assessment of a 2SBRSBF system. In Section 4, we present a novel analytical solution of the formulated problem in the case of distributions with constant failure/repair rate. A numerical solution will be identified in Section 5 where a system of four simultaneous differential algebraic equations will model the 2SBRSBF in the case of full aging of the backup component during standby. In Section 6, the same problem will be solved with simulation which can be used with any distributions under three different assumptions about the aging mechanism of the backup component. Section 7 contains the results of three numerical examples, where we validate the proposed simulational algorithm quantitatively (by comparing with the analytical and the numerical solutions when those exist) and qualitatively (by checking whether the effects of no aging and partial aging correspond to the logically expected ones). Section 8 concludes the paper.

2. Related Works and Contributions of the Paper

Although publications on warm-backup system reliability have been growing recently, they remain rare in comparison with reliability studies of cold-backup and hot-backup systems, since realistic models of the former tend to be more elaborate [18]. In [19] (pp. 113–115), the analytical solution for a two-component warm-standby system with switching failure (2SBSF) was developed. The switching mechanism fails on demand. The failure distributions were considered exponential. Hence, no aging effect was taken into account. Explicit formulae were derived for the reliability of the system and for all state probability functions. In [20] (pp. 167–170), a model of a two-component warm-standby system (2SB) with arbitrary failure distributions was proposed. Although no particular simulation algorithm was developed, general advice was given on how to acquire the reliability function and the state probability functions using Monte Carlo simulation and how to deal with different aging assumptions. In [21], a model of a 2SB system was proposed under general standby, which generalizes the three special cases of warm-, cold- and hot-standby. The failure distributions can be arbitrary. The aging effects are accounted for using a pre-specified virtual aging function. An integral equation connecting the failure rates and the virtual aging function with the reliability of the system was proposed. In [6], these results from [21] were generalized to solve the problem of allocation of redundancy that includes two independent and one generalized standby component. The reliability and the state probability functions of a generic two-component standby system under full aging, no aging, and partial aging were identified with a simulation algorithm in [22] using arbitrary failure distributions. That solution is verified with analytical and numerical special cases. 
The results from this work were expanded in [17] to model the 2SBSF, but some numerical problems connected with random variate generation and arbitrary failure rate calculations were resolved.
The majority of the above models consider aging effects, but none of them has repairs.
A 14-state model of two dissimilar warm-standby subsystems in series with repair was discussed in [23]. The failure distributions are exponential, and the system has constant repair rates. The type of repair is AGAN. The problem of aging is not considered. Some analytical steady-state characteristics of the system are provided using Laplace transforms. Those characteristics for a two-component warm-standby system with repair (2SBR) can be obtained as special cases from the results in the paper. The work [4] performs reliability analysis for a two-component warm-standby system with repair and switching failures (2SBRSF). The failure and repair rates are constant. The switching mechanism is of continuous type and has its own failure distribution. This leads to a possibility of repairing the failed backup unit while the primary component operates. All failure and repair distributions are exponential. Any failure of the switch leads to system failure. The repairs are AGAN and no aging is considered. The system has 10 states. The reliability and the state probability functions of the system were identified with a numerical algorithm as a solution of an ODE system. Another interesting two-identical-component standby system is given in [24]. The type of standby is difficult to determine since the failure in standby mode is deterministic and happens after surpassing a pre-specified time. The failure distribution of the operating unit is exponential, but the repair rate is arbitrary. There is no switching failure, but the switching mechanism inspects the failed standby unit and decides whether to replace it or to repair it. No aging is considered in this model. Some steady-state measures of reliability are obtained using semi-Markov models. In [25], the authors propose a system with m identical components working in parallel with s components in warm-standby. The system includes a service station that can also fail and be repaired. 
There are no switching failures, and all failure and repair distributions of the components are exponential. The failure and repair distributions of the service station are also exponential. The repairs are AGAN, and no aging is considered. The reliability and the state probability functions of the system were approximated using symbolic computer software. The work [18] presents a system of n components in series with one component in warm-standby. There are neither switching failures nor aging considerations. The failure distributions are exponential, but the repair distributions are arbitrary. The system is also subjected to non-repairable failures. Some reliability and availability steady-state characteristics of the system are derived using Laplace transforms. In [26], the authors discuss a three identical component warm-standby system. Initially, the primary component is working, and the other two are in standby. The failure of the operating unit and the repairs are with random distributions, however in standby there is constant failure rate. The repairs are AGAN, there are no switching failures, and no aging is considered. An integral equation, connecting the failure rates with the reliability of the system is proposed.
The models with repair discussed above do not consider any aging effects.
In Table 1, we summarized seven characteristics for each of the above-discussed 12 papers plus the current work. The information in Table 1 highlights the novelty of our work against the discussed state-of-the-art studies in the literature. The contributions of our study can be outlined as follows:
  • We shall formulate a novel model of 2SBRSBF containing three operational states and four system failure substates. The switching mechanism will fail on demand and the repair of the primary unit will be WTO. This warm-standby system will utilize arbitrary failure and repair distributions and will have three types of aging modes of the backup component in warm-standby—no aging, full aging, and partial aging.
  • We shall create a novel six-attribute procedure, which gives numerically stable estimates of the equivalent age of the backup unit under any of the three aging assumptions.
  • We shall formulate 11 properties of the event chain (EC) describing the 2SBRSBF that can happen during the normal exploitation of the system.
  • We shall develop a novel algorithm to generate a random EC for the 2SBRSBF, which satisfies the 11 EC properties listed in the previous item.
  • We shall propose a simulation algorithm to calculate the state probability functions and the rest of the reliability characteristics of a 2SBRSBF in their dynamics.
  • We shall develop a novel analytical solution of the 2SBRSBF when the failure and the repair rates are constant. We will prove that the solution is real for any constant failure/repair rates and switching mechanism failure probabilities.
  • We shall develop a numerical solution of the 2SBRSBF under the assumption of full aging of the backup component in warm-standby. The procedure will use a semi-explicit system of four simultaneous differential algebraic equations (DAEs) with differential index 1, singular constant mass matrix, and Jacobian matrix depending only on the time. The main novelty is the calculation of stable approximations of the failure/repair rates at any moment of time.
  • We shall verify quantitatively the results from the simulation procedure using analytical and numerical solutions in special cases of the 2SBRSBF. The solutions in the three aging modes will serve as qualitative validation of the simulation solution.

3. States, Transition Rates, and Distributions

The dynamics of a 2SBRSBF system can be determined by its transition between several possible states [27]. The 2SBRSBF has four major states, but State 4 (where the 2SBRSBF system is not operational) is subdivided into 4 substates, called types.
In State 1, the primary component operates, the backup component is fully operational but is in standby. Sooner or later, one of the two components will fail:
(A)
If the primary component fails, the system will attempt a transit to State 2, where the backup component operates and the primary component is under repair. However, if the switching device fails to operate properly, we observe the so-called switching failure on demand resulting in transition to State 4, where the 2SBRSBF system is not operational (type a system failure).
(B)
If the backup component fails in standby, the system will transit to State 3 where the primary component operates but the backup component is not operational. There will be no indication whether the system is in State 1 or in State 3, so no maintenance decision will be made in those two states.
In State 2 sooner or later either the primary component will be repaired, or the backup component will fail. Then one of the following two events will occur:
(A)
If the primary component is repaired, the system will try a transit to State 1. However, if the switching device fails to operate properly, we observe the so-called back-switching failure resulting in transition to State 4, where the 2SBRSBF system is not operational (type b system failure).
(B)
If the backup component fails in operation, the system will transit to State 4 where the 2SBRSBF system is not operational (type c system failure).
In State 3, sooner or later, the primary component will fail and there will be no operational backup component to take over. The system will transit to State 4 where the 2SBRSBF system is not operational (type d system failure).
The State 4, where the 2SBRSBF system is not operational, is irreversible in our model regardless of the type of the system failure.
The described system is partially observable since we will not know whether the system is in State 1 or in State 3, but State 4 and State 2 are observable. At the same time, 2SBRSBF is controllable by three trivial event-driven decisions: (a) when the primary component fails, attempt to move to State 2, by switching to the backup unit; (b) when the backup unit is in operation, start repairing the primary component; (c) when the primary component is repaired, attempt to move to State 1, by back-switching to the primary unit.
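The transitions between the states described above can be sketched as a small event-driven simulation. This is only a minimal illustration, assuming constant failure/repair rates (so that the memoryless property makes any aging assumption irrelevant and times can simply be redrawn on each visit to a state); the function name `simulate_chain` and the parameter names `lam1`..`lam4`, `pf`, `pr` are hypothetical and are not the simulation algorithm developed in the paper.

```python
import random

def simulate_chain(lam1, lam2, lam3, lam4, pf, pr, rng):
    """Simulate one run until system failure.

    lam1: failure rate of the primary component in operation
    lam2: failure rate of the backup component in operation
    lam3: failure rate of the backup component in standby
    lam4: repair rate of the primary component
    pf, pr: probabilities of switching / back-switching failure on demand
    Returns (time_of_system_failure, failure_type) with type in 'a'..'d'.
    """
    t, state = 0.0, 1
    while True:
        if state == 1:
            # Competing events: primary fails vs. backup fails in standby.
            t_primary = rng.expovariate(lam1)
            t_standby = rng.expovariate(lam3)
            if t_standby < t_primary:
                t += t_standby
                state = 3                      # undetected backup failure
            else:
                t += t_primary
                if rng.random() < pf:
                    return t, "a"              # switching failure on demand
                state = 2                      # backup takes over
        elif state == 2:
            # Competing events: repair finishes vs. backup fails in operation.
            t_repair = rng.expovariate(lam4)
            t_backup = rng.expovariate(lam2)
            if t_backup < t_repair:
                return t + t_backup, "c"       # backup fails during repair
            t += t_repair
            if rng.random() < pr:
                return t, "b"                  # back-switching failure
            state = 1                          # primary back in operation
        else:
            # State 3: primary operates with no usable backup.
            return t + rng.expovariate(lam1), "d"

# Illustrative (made-up) parameter values.
rng = random.Random(2021)
runs = [simulate_chain(0.01, 0.02, 0.005, 0.1, 0.05, 0.05, rng)
        for _ in range(10_000)]
mean_time_to_failure = sum(t for t, _ in runs) / len(runs)
```

Averaging the returned times over many runs gives a Monte Carlo estimate of the system MTTF, and counting the returned types gives the relative frequency of each type of system failure.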
The state function Pg(t) (for g = 1,2,3,4) measures the probability of the 2SBRSBF to be in State g at time t (for t ≥ 0). Since the system will be in one state and in one state only at any non-negative time moment t, then:
$$P_1(t) + P_2(t) + P_3(t) + P_4(t) = 1, \quad \text{for } t \ge 0 \tag{1}$$
The 2SBRSBF system starts in fully operational mode so initially it will be in State 1:
$$P_1(0) = 1 \quad \text{and} \quad P_2(0) = P_3(0) = P_4(0) = 0 \tag{2}$$
If the four state functions are identified, then the 2SBRSBF system is quantitatively described and we can calculate all its reliability characteristics. The reliability of the system is the sum of the first three state probabilities (i.e., the probability not to be in State 4):
$$R_{sys}(t) = P_1(t) + P_2(t) + P_3(t) = 1 - P_4(t), \quad \text{for } t \ge 0 \tag{3}$$
The mean time to failure (MTTF) of the 2SBRSBF system can be calculated as:
$$MTTF_{sys} = \int_0^{\infty} R_{sys}(t)\, dt \tag{4}$$
The time for which the reliability of the system will be α is known as α-design life (tdes,α). It can be identified as the unique solution of Equation (5) in the domain $t_{des,\alpha} \in [0, \infty)$:
$$R_{sys}(t_{des,\alpha}) = \alpha, \quad \text{for } \alpha \in (0, 1] \tag{5}$$
The median (Mediansys), the B1 life (B1_life), the B10-life (B10_life), and the interquartile range (IQRsys) of the 2SBRSBF system reliability can be easily estimated using Equation (5) respectively as $t_{des,0.5}$, $t_{des,0.99}$, $t_{des,0.9}$, and $t_{des,0.25} - t_{des,0.75}$ [20] (pp. 87–88).
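Equations (4) and (5) can be illustrated with a short numerical sketch. It assumes that the system reliability is available as a decreasing callable; the exponential `r_sys` and the value of `rate` are made-up placeholders used only to make the example runnable, and the helper names `mttf` and `design_life` are hypothetical.

```python
import math

def mttf(reliability, t_max, steps=100_000):
    """Trapezoidal approximation of Equation (4), truncated at t_max."""
    h = t_max / steps
    total = 0.5 * (reliability(0.0) + reliability(t_max))
    for i in range(1, steps):
        total += reliability(i * h)
    return total * h

def design_life(reliability, alpha, tol=1e-9):
    """Solve Equation (5), R_sys(t) = alpha, by bisection.

    Assumes the reliability function is continuous and decreasing in t.
    """
    t_lo, t_hi = 0.0, 1.0
    while reliability(t_hi) > alpha:   # expand the bracket until it contains the root
        t_hi *= 2.0
    while t_hi - t_lo > tol:
        mid = 0.5 * (t_lo + t_hi)
        if reliability(mid) > alpha:
            t_lo = mid
        else:
            t_hi = mid
    return 0.5 * (t_lo + t_hi)

# Placeholder reliability function (assumption for illustration only).
rate = 0.01
r_sys = lambda t: math.exp(-rate * t)

mttf_est = mttf(r_sys, t_max=3000.0)
median = design_life(r_sys, 0.5)
b1_life = design_life(r_sys, 0.99)
b10_life = design_life(r_sys, 0.9)
iqr = design_life(r_sys, 0.25) - design_life(r_sys, 0.75)
```

The interquartile range is computed as the 0.25-design life minus the 0.75-design life because a lower reliability level is reached at a later time.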
To identify the four required state functions of the 2SBRSBF system, we need to know:
  • The probability, pf, for switching failure on demand.
  • The probability, pr, for back-switching failure on demand.
  • The probability density function (PDF), f1(t), of the failure distribution for the primary component in operation.
  • The PDF, f2(t), of the failure distribution for the standby component in operation.
  • The PDF, f3(t), of the failure distribution for the standby component in standby.
  • The PDF, f4(t), of the repair distribution for the primary component.
Each of the four PDFs, fk(t) (for k = 1, 2, 3, 4), can be transformed into four alternative forms: a cumulative distribution function (CDF), Fk(t); a failure/repair rate, λk(t) (as shown in [28]); a complementary CDF, Rk(t); and an inverse CDF, $F_k^{-1}(p)$. The five forms fk(t), Fk(t), λk(t), Rk(t), and $F_k^{-1}(p)$ contain the same information and are equivalent. In an ideal world, the domain of the first four functions and the range of the last one would be $t \ge 0$, where t can be interpreted as time. However, this is not always the case. Those failure and repair distributions are based on information about the behavior of the components, and they are constructed in two steps. The first step is to summarize the available information in several nodes of the CDF. If the reliability information is in the form of fully observed or multiply censored data, then we can produce an empirical distribution, using either the Kaplan-Meier product limit estimator method [29] (see the function ecdf.m in [30], which embodies the method) or the invertible ECDF estimator with maximum count of nodes [31], or any other modern method. If the information is in the form of expert knowledge, then we can extract subjective quantiles using the triple bisection method [32] as described in [33]. The second step is to fit a parametric distribution of some type to the nodes of the CDF identified in the first step. The work [20] (p. 399) gives several reasons to use parametric distributions rather than empirical ones, with the most important one being that empirical distributions can only be trusted at the beginning of the failure/repair process. Regardless of the method utilized to identify the parameters in the second step (least squares, maximum likelihood estimation, Bayesian estimation, etc.), it is quite possible that some of the derived parametric distributions would have substantial support for negative values of the argument t. 
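As an illustration of the first step, the Kaplan-Meier product limit estimator can be sketched in a few lines. This is a generic textbook version handling right-censored observations only; it is not the ecdf.m implementation, and the function name `kaplan_meier` and the tiny data set are hypothetical.

```python
def kaplan_meier(times, observed):
    """Product limit estimate of the survival function.

    times:    observation times (failure or censoring)
    observed: True if a failure was observed, False if right-censored
    Returns a list of (t_i, S(t_i)) at the distinct observed failure times.
    """
    data = sorted(zip(times, observed))
    at_risk = len(data)          # units still under observation
    survival = 1.0
    curve = []
    i = 0
    while i < len(data):
        t = data[i][0]
        failures = removed = 0
        # Group all observations that share the same time.
        while i < len(data) and data[i][0] == t:
            failures += 1 if data[i][1] else 0
            removed += 1
            i += 1
        if failures:
            survival *= 1.0 - failures / at_risk
            curve.append((t, survival))
        at_risk -= removed       # failed and censored units both leave the risk set
    return curve

# Made-up data: the unit observed until time 2 was censored, not failed.
km_curve = kaplan_meier([1.0, 2.0, 3.0], [True, False, True])
cdf_nodes = [(t, 1.0 - s) for t, s in km_curve]   # nodes of the empirical CDF
```

The resulting `cdf_nodes` play the role of the CDF nodes to which a parametric distribution is fitted in the second step.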
For purely pragmatic reasons, we assume that for each k, we are given only procedures to calculate fk(t), Fk(t), and $F_k^{-1}(p)$. Such numerical procedures exist in almost any software package. For example, the Statistics and Machine Learning Toolbox in MATLAB contains pdf.m, cdf.m, and icdf.m, which calculate the PDF, the CDF, and the inverse CDF values for any distribution object created by makedist.m [30]. The latter offers a wide variety of parametric 1D distributions with arbitrarily specified parameters. Unfortunately, some of those parametric distributions are defined over the whole real axis (e.g., the normal distribution, or the extreme value distribution). Traditionally, no numerical procedures are given for estimating the values of λk(t) and Rk(t), which have to be approximated using fk(t), Fk(t), and $F_k^{-1}(p)$. In this paper, any of the procedures fk(t), Fk(t), Rk(t), λk(t), $F_k^{-1}(p)$ will be called the kth original distribution since the five of them describe in alternative forms the uncertainty of a real continuous variable:
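$$
\begin{aligned}
&(a)\quad f_k(t), && \text{for } k = 1, 2, 3, 4 \text{ with Domain } t \in (-\infty, +\infty)\\
&(b)\quad F_k(t) = \int_{-\infty}^{t} f_k(\tau)\, d\tau, && \text{for } k = 1, 2, 3, 4 \text{ with Domain } t \in (-\infty, +\infty)\\
&(c)\quad \lambda_k(t) = f_k(t) / \left[1 - F_k(t)\right], && \text{for } k = 1, 2, 3, 4 \text{ with Domain } t \in (-\infty, +\infty)\\
&(d)\quad R_k(t) = 1 - F_k(t), && \text{for } k = 1, 2, 3, 4 \text{ with Domain } t \in (-\infty, +\infty)\\
&(e)\quad F_k^{-1}(p), && \text{for } k = 1, 2, 3, 4 \text{ with Domain } p \in [0, 1]
\end{aligned}
\tag{6}
$$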
Here, Rk(t) from Equation (6d) is also known as the original reliability/repair function when the real argument t is non-negative and can be interpreted as time. In our problem, the argument t would most often be the time (or another suitable non-negative variable, e.g., mileage), so we will use the original distributions in Equation (6a–e) to approximate their truncated versions, which take the form of conditional distributions provided that the failure/repair has not happened till time 0:
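$$
\begin{aligned}
&(a)\quad f_{k,trun}(t) = f_k(t \mid 0) = f_k(t) / R_k(0), && \text{for } k = 1, 2, 3, 4 \text{ with Domain } t \in [0, +\infty)\\
&(b)\quad F_{k,trun}(t) = F_k(t \mid 0) = 1 - R_k(t) / R_k(0), && \text{for } k = 1, 2, 3, 4 \text{ with Domain } t \in [0, +\infty)\\
&(c)\quad \lambda_{k,trun}(t) = \lambda_k(t \mid 0) = f_{k,trun}(t) / \left[1 - F_{k,trun}(t)\right], && \text{for } k = 1, 2, 3, 4 \text{ with Domain } t \in [0, +\infty)\\
&(d)\quad R_{k,trun}(t) = R_k(t \mid 0) = 1 - F_{k,trun}(t), && \text{for } k = 1, 2, 3, 4 \text{ with Domain } t \in [0, +\infty)\\
&(e)\quad F_{k,trun}^{-1}(p) = F_k^{-1}(p \mid 0), && \text{for } k = 1, 2, 3, 4 \text{ with Domain } p \in [0, 1]
\end{aligned}
\tag{7}
$$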
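The relation between an original distribution and its version truncated at time 0 can be sketched as a thin wrapper around any object that offers pdf, cdf, and inverse CDF procedures. The sketch below uses `statistics.NormalDist` from the Python standard library as a stand-in for such an object; the class name `TruncatedAtZero` and the chosen parameters are hypothetical. The rate is computed here as a naive ratio, which can lose accuracy far in the right tail and is not the stabilized approximation referred to in the paper.

```python
from statistics import NormalDist

class TruncatedAtZero:
    """Conditional distribution given that the event has not happened till time 0."""

    def __init__(self, original):
        self.original = original
        self.r0 = 1.0 - original.cdf(0.0)      # R_k(0) of the original distribution

    def pdf(self, t):
        return self.original.pdf(t) / self.r0 if t >= 0.0 else 0.0

    def cdf(self, t):
        if t < 0.0:
            return 0.0
        return 1.0 - (1.0 - self.original.cdf(t)) / self.r0

    def reliability(self, t):
        return 1.0 - self.cdf(t)

    def rate(self, t):
        # Naive ratio f_trun / R_trun; numerically fragile when R_trun is tiny.
        return self.pdf(t) / self.reliability(t)

    def inv_cdf(self, p):
        # F_trun(t) = p  <=>  F(t) = 1 - (1 - p) * R(0); valid for 0 < p < 1 here.
        return self.original.inv_cdf(1.0 - (1.0 - p) * self.r0)

# Made-up original distribution with noticeable mass below zero.
trunc = TruncatedAtZero(NormalDist(mu=1.0, sigma=2.0))
t30 = trunc.inv_cdf(0.3)        # 30th percentile of the truncated distribution
check = trunc.cdf(t30)          # should reproduce 0.3 up to rounding
```

Note that the truncated rate coincides with the original rate for non-negative t, since the factor R(0) cancels in the ratio.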
In this paper, any of the functions fk,trun(t), Fk,trun(t), Rk,trun(t), λk,trun(t), $F_{k,trun}^{-1}(p)$ will be called the kth truncated distribution, since the five of them describe in alternative forms the uncertainty of a real non-negative continuous variable which can be interpreted as time. The Rk,trun(t) from Equation (7d) is also known as the truncated reliability/repair function. Let us concentrate on the 2SBRSBF system at time t:
  • The rate for transitioning between State 1 and State 2 will depend on P1(t), on pf, and on the conditional failure distribution f1(τ|t) (failure density of the primary component in operation, given that it has not failed till time t). The reason is that any possible previous repairs of the primary component were of minimal type, which equates to the full aging assumption for the primary component during repair, so any failure will behave like a first failure at time t.
  • The rate for transitioning between State 1 and State 4 (type a system failure) will depend on P1(t), on pf, and on the conditional failure distribution f1(τ|t) since the same arguments made for the State 1–State 2 transition apply.
  • The rate for transitioning between State 3 and State 4 (type d system failure) will depend on P3(t) and on the conditional failure distribution f1(τ|t) since the same arguments made for the State 1–State 2 transition apply, except that no switching is attempted.
  • The rate for transitioning between State 2 and State 1 will depend on P2(t), on pr, and on the conditional repair distribution f4(τ|t) (repair density of the primary component, given that the repair starts at time t). The reason is that any possible previous repairs of the primary component were of minimal type, which equates to the full aging assumption for the primary component during operation, so any repair will look like a first repair at time t.
  • The rate for transitioning between State 2 and State 4 (type b system failure) will depend on P2(t), on pr, and on the conditional repair distribution f4(τ|t) since the same arguments made in the State 2–State 1 transition apply.
  • The rate for transitioning between State 1 and State 3 will depend on P1(t) and on the conditional failure distribution f3(τ|t) (failure density of the backup component in standby, given that it has not failed till time t). The backup component is never repaired until there is a system failure, which suggests that the failure rate in standby should depend only on the time the system operates but not on the backup component history of utilization (alternating between operational and standby modes).
  • The rate for transitioning between State 2 and State 4 (type c system failure) will depend on P2(t) and on the conditional failure distribution f2(τ|tage) (failure density of the backup component in operation, given that it has not failed till time tage). Here tage is the equivalent aging of the backup component in operation. It depends on the type of aging and possibly on the backup component history of utilization (alternating between operational and standby modes).
The four state functions of the 2SBRSBF system can be identified using computer simulation in the above setup for any set of distributions and aging assumptions during standby. However, for verification purposes, two alternative solution methods can be developed for some special cases of the 2SBRSBF system. This approach was successfully applied in [34] for verification of a novel simulation-based optimization algorithm used in redundancy allocation problems using Markovian models as special cases.
If we have a set of First Case distributions, then all state transitions will depend on the absolute densities rather than on conditional ones. The reason is that the exponential distributions have no memory, and hence any aging assumptions are irrelevant. Then the probabilities for transitioning between the states depend only on the current state of the system, but not on the history describing how the system arrived at the current state. This means that the 2SBRSBF system with First Case distributions degenerates to a Markov model [27] (more precisely to a partially observable Markov decision process [35]). Such a Markov model can be conveniently visualized with the Rate Diagram (RD) [20] (pp. 155–170) shown in Figure 1a. Using that RD, we will derive an analytical solution for the four state probability functions of the 2SBRSBF system with First Case distributions.
If we have a system with the full aging assumption during standby, then the equivalent age of the backup component in operation, tage, described above for the transition between State 2 and State 4 (type c system failure), will be simply the current time t. The reason is that the backup component is assumed to age during standby in the same fashion as in operation, so any failure of the backup component during operation will behave like a first failure at time t. This means that the 2SBRSBF system with the full aging assumption degenerates to a semi-Markov model where the transition probabilities depend not only on the current state but also on the current time [36]. The semi-Markov model can be conveniently visualized with the Generalized Rate Diagram (GRD) shown in Figure 1b [37] (pp. 521–526). Using the GRD, we can describe the 2SBRSBF system with simultaneous ODEs. This is possible because the failure/repair rate of any distribution, F(t), at time t* coincides with the failure/repair rate of the conditional distribution F(τ|T) at the same time t*, if T ≤ t*. This fact is proven in Appendix A. The derived Cauchy problem can be solved numerically. Such a solution also exists for the First Case distributions, which allows the comparison of the analytical and the numerical solutions.
Neither the analytical nor the numerical solution can be derived for Second Case distributions under the no aging and partial aging assumptions, since general aging effects cannot be described by any Markovian or semi-Markovian model, and no system of ODEs fully quantifies the reliability behavior of the 2SBRSBF system unless the backup component is subjected to full aging in standby (see [14,36]).

4. Analytical Solution

This solution is applicable only for First Case distributions, where the failure/repair rates are constant. The rate diagram in Figure 1a can be represented as a system of three ODEs from Equation (8) about the first three state probability functions [38]:
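$$
\begin{aligned}
\frac{dP_1}{dt}(t) &= -(\lambda_1+\lambda_3)\,P_1(t) + (1-p_r)\,\lambda_4\,P_2(t)\\
\frac{dP_2}{dt}(t) &= (1-p_f)\,\lambda_1\,P_1(t) - (\lambda_4+\lambda_2)\,P_2(t)\\
\frac{dP_3}{dt}(t) &= \lambda_3\,P_1(t) - \lambda_1\,P_3(t)
\end{aligned}
\tag{8}
$$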
The initial conditions are given in Equation (2). After solving Equation (8), the last probability function, P4(t), can be estimated from Equation (3) as the complement to 1 of the other state probability functions. The analytical solution of 2SBRSBF with First Case distributions can be described as: “set the constants from Equation (9) and form the state probability functions from Equation (10)” (see Appendix B for the proof).
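$$
\begin{aligned}
K &= (\lambda_1+\lambda_2+\lambda_3+\lambda_4)/2; \quad C = (\lambda_1+\lambda_3)(\lambda_2+\lambda_4) - (1-p_f)(1-p_r)\,\lambda_1\lambda_4\\
s_1 &= -K + \sqrt{K^2 - C}; \quad s_2 = -K - \sqrt{K^2 - C}\\
A_1 &= \frac{s_1+\lambda_2+\lambda_4}{s_1-s_2}; \quad B_1 = -\frac{s_2+\lambda_2+\lambda_4}{s_2-s_1}; \quad A_2 = \frac{(1-p_f)\,\lambda_1}{s_1-s_2}; \quad B_2 = -\frac{(1-p_f)\,\lambda_1}{s_2-s_1}\\
A_3 &= \frac{\lambda_3\,(s_1+\lambda_2+\lambda_4)}{(s_1-s_2)(\lambda_1+s_1)}; \quad B_3 = -\frac{\lambda_3\,(s_2+\lambda_2+\lambda_4)}{(s_2-s_1)(\lambda_1+s_2)}; \quad C_3 = \frac{\lambda_3\,(-\lambda_1+\lambda_2+\lambda_4)}{(-\lambda_1-s_1)(-\lambda_1-s_2)}
\end{aligned}
\tag{9}
$$
$$
\text{Domain: } t \ge 0, \qquad
\begin{aligned}
P_1(t) &= A_1 e^{s_1 t} - B_1 e^{s_2 t}\\
P_2(t) &= A_2 e^{s_1 t} - B_2 e^{s_2 t}\\
P_3(t) &= A_3 e^{s_1 t} - B_3 e^{s_2 t} + C_3 e^{-\lambda_1 t}\\
P_4(t) &= 1 - (A_1+A_2+A_3)\,e^{s_1 t} + (B_1+B_2+B_3)\,e^{s_2 t} - C_3 e^{-\lambda_1 t}
\end{aligned}
\tag{10}
$$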
The reliability of the system from Equation (11) and its MTTF from Equation (12) are derived as special cases of Equations (3) and (4):
$$
R_{sys}(t) = (A_1+A_2+A_3)\,e^{s_1 t} - (B_1+B_2+B_3)\,e^{s_2 t} + C_3 e^{-\lambda_1 t}, \quad \text{for } t \ge 0
\tag{11}
$$
$$
MTTF_{sys} = -(A_1+A_2+A_3)/s_1 + (B_1+B_2+B_3)/s_2 + C_3/\lambda_1
\tag{12}
$$
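The closed-form solution can be checked numerically with a minimal Python sketch. It assumes the sign convention in which every state probability has the form A·exp(s1·t) − B·exp(s2·t) with s1 and s2 the two negative roots of s² + 2Ks + C = 0; the function name `analytical_solution` and the rate values in the usage note are illustrative assumptions, not taken from the paper.

```python
import math

def analytical_solution(l1, l2, l3, l4, p_f, p_r):
    """Constants of Eq. (9); returns (probs, reliability, mttf) for constant rates."""
    k = (l1 + l2 + l3 + l4) / 2.0
    c = (l1 + l3) * (l2 + l4) - (1.0 - p_f) * (1.0 - p_r) * l1 * l4
    root = math.sqrt(k * k - c)            # k*k - c >= 0 for non-negative rates
    s1, s2 = -k + root, -k - root          # both roots are negative
    a1 = (s1 + l2 + l4) / (s1 - s2)
    b1 = -(s2 + l2 + l4) / (s2 - s1)
    a2 = (1.0 - p_f) * l1 / (s1 - s2)
    b2 = -(1.0 - p_f) * l1 / (s2 - s1)
    a3 = l3 * (s1 + l2 + l4) / ((s1 - s2) * (l1 + s1))
    b3 = -l3 * (s2 + l2 + l4) / ((s2 - s1) * (l1 + s2))
    c3 = l3 * (-l1 + l2 + l4) / ((-l1 - s1) * (-l1 - s2))

    def probs(t):
        """State probabilities (P1, P2, P3, P4) at time t >= 0, as in Eq. (10)."""
        e1, e2, e3 = math.exp(s1 * t), math.exp(s2 * t), math.exp(-l1 * t)
        p1 = a1 * e1 - b1 * e2
        p2 = a2 * e1 - b2 * e2
        p3 = a3 * e1 - b3 * e2 + c3 * e3
        return p1, p2, p3, 1.0 - p1 - p2 - p3

    def reliability(t):
        """System reliability R_sys(t) = 1 - P4(t), as in Eq. (11)."""
        return 1.0 - probs(t)[3]

    # MTTF as the integral of R_sys over [0, infinity), Eq. (12)
    mttf = -(a1 + a2 + a3) / s1 + (b1 + b2 + b3) / s2 + c3 / l1
    return probs, reliability, mttf
```

For example, `analytical_solution(1.0, 2.0, 0.5, 4.0, 0.1, 0.05)` returns functions that satisfy the initial conditions P1(0) = 1 and P2(0) = P3(0) = 0. The sketch does not handle the degenerate cases s1 = s2 or λ1 = −s1, where the partial fractions in Equation (9) are undefined.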

5. Numerical Solution

This solution is applicable for any Second Case distribution with full aging of the backup component in standby and for any First Case distribution. The GRD in Figure 1b can be represented as a system of four simultaneous DAEs from Equation (16) about the four state probability functions, Pg(t) (g = 1,2,3,4). The system of DAEs will be numerically integrated from 0 to tend, where the latter will be selected sufficiently large, so that $R_{sys}(t_{end}) \approx 0$ (< 0.01). The main numerical difficulty in solving Equation (16) is to devise a procedure for stable approximation of the failure/repair rates, λk(t) (k = 1,2,3,4), at any $t \in [0, t_{end}]$. That problem is far from trivial, since Fk(t) is sometimes so close to 1 that the denominator of Equation (7) turns into 0. For each of the four distributions, using the original inverse CDF, we can calculate the time $t_{\lambda,k}$ at which the denominator of Equation (7) equals 100 times the machine epsilon (ϵ):
$$
t_{\lambda,k} = F_k^{-1}(1 - 100\varepsilon), \quad \text{for } k = 1,2,3,4
\tag{13}
$$
The approximated failure/repair rate, λk,a(t) (k = 1,2,3,4), equals Equation (7) if its denominator is greater than 100ϵ, and equals the failure/repair rate at $t_{\lambda,k}$ otherwise:
$$
\lambda_{k,a}(t) =
\begin{cases}
f_k(t)\,/\,[1 - F_k(t)], & t \in [0, t_{\lambda,k})\\
f_k(t_{\lambda,k})\,/\,[1 - F_k(t_{\lambda,k})], & t \in [t_{\lambda,k}, \infty)
\end{cases}
\quad \text{where } k = 1,2,3,4
\tag{14}
$$
Equation (14) produces numerically stable approximations of the failure/repair rates at any non-negative time not greater than tend. This is true even when a distribution is truncated, which means that Fk(0) > 0 and only its part in the non-negative domain has to be used. Then, according to Appendix A, the value of the failure/repair rate for any non-negative time will be the same as that of the non-truncated distribution, since the truncated distribution can be represented as a conditional non-truncated one:
$$
F_{k,trun}(t) = F_k(t \mid T_0 = 0) = 1 - [1 - F_k(t)]\,/\,[1 - F_k(0)], \quad t \ge 0
\tag{15}
$$
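The capping idea behind Equations (13) and (14) can be illustrated with a short Python sketch. The helper names `weibull` and `make_stable_rate` and the Weibull parameters are assumptions made for illustration only; any distribution supplying a density, a CDF, and an inverse CDF could be used instead.

```python
import math
import sys

EPS100 = 100.0 * sys.float_info.epsilon   # the 100*epsilon threshold of Eq. (13)

def weibull(shape, scale):
    """Illustrative distribution: returns (pdf, cdf, inverse_cdf) of a Weibull law."""
    def pdf(t):
        return (shape / scale) * (t / scale) ** (shape - 1) * math.exp(-((t / scale) ** shape))
    def cdf(t):
        return 1.0 - math.exp(-((t / scale) ** shape))
    def inv(p):
        return scale * (-math.log(1.0 - p)) ** (1.0 / shape)
    return pdf, cdf, inv

def make_stable_rate(pdf, cdf, inv_cdf):
    """Return (rate, t_lam): the capped failure/repair rate of Eq. (14) and its cut-off time."""
    t_lam = inv_cdf(1.0 - EPS100)                       # Eq. (13)
    rate_at_t_lam = pdf(t_lam) / (1.0 - cdf(t_lam))     # denominator is about 100*epsilon

    def rate(t):
        if t < t_lam:
            return pdf(t) / (1.0 - cdf(t))              # ordinary hazard rate
        return rate_at_t_lam                            # frozen value beyond t_lam

    return rate, t_lam
```

With `rate, t_lam = make_stable_rate(*weibull(2.0, 1.0))`, a call such as `rate(100.0)` returns the frozen value at `t_lam` instead of dividing by a denominator that has underflowed to zero.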
Now, we can write the DAE system corresponding to Figure 1b:
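$$
\begin{aligned}
\frac{dP_1}{dt}(t) &= -[\lambda_{1,a}(t)+\lambda_{3,a}(t)]\,P_1(t) + (1-p_r)\,\lambda_{4,a}(t)\,P_2(t)\\
\frac{dP_2}{dt}(t) &= (1-p_f)\,\lambda_{1,a}(t)\,P_1(t) - [\lambda_{4,a}(t)+\lambda_{2,a}(t)]\,P_2(t)\\
\frac{dP_3}{dt}(t) &= \lambda_{3,a}(t)\,P_1(t) - \lambda_{1,a}(t)\,P_3(t)\\
0 &= P_1(t) + P_2(t) + P_3(t) + P_4(t) - 1
\end{aligned}
\tag{16}
$$
The dependent variables can be organized in a 4D vector: $y(t) = [P_1(t), P_2(t), P_3(t), P_4(t)]^T$. The DAE from Equation (16) is semi-explicit with differential index 1. It has a singular constant mass matrix:
$$
M(t,y) = \begin{bmatrix} 1 & 0 & 0 & 0\\ 0 & 1 & 0 & 0\\ 0 & 0 & 1 & 0\\ 0 & 0 & 0 & 0 \end{bmatrix}
\tag{17}
$$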
The Jacobian matrix of the RHS of Equation (16) depends only on the time t:
$$
J(t,y) = \begin{bmatrix}
-\lambda_{1,a}(t)-\lambda_{3,a}(t) & (1-p_r)\,\lambda_{4,a}(t) & 0 & 0\\
(1-p_f)\,\lambda_{1,a}(t) & -\lambda_{4,a}(t)-\lambda_{2,a}(t) & 0 & 0\\
\lambda_{3,a}(t) & 0 & -\lambda_{1,a}(t) & 0\\
1 & 1 & 1 & 1
\end{bmatrix}
\tag{18}
$$
The initial conditions given in Equation (2) together with Equations (16) and (17) form a Cauchy problem:
$$
M(t,y)\,y'(t) = f(t, y(t)) \quad \text{with} \quad y_{ini} = y(0) = [P_1(0), P_2(0), P_3(0), P_4(0)]^T = [1, 0, 0, 0]^T
\tag{19}
$$
Here, $y'(t) = [dP_1(t)/dt,\ dP_2(t)/dt,\ dP_3(t)/dt,\ dP_4(t)/dt]^T$, $M(t,y)$ is the mass matrix (17), and the 4D $f(t, y(t))$ is the RHS of Equation (16). The problem from Equation (19) can be numerically integrated (e.g., using ode15s.m from MATLAB [39]) at 2000 evenly distributed time points from 0 to tend:
$$
t_i = (i-1)\,t_{end}/1999, \quad \text{for } i = 1,2,\ldots,2000
\tag{20}
$$
The reliability function and the MTTFsys can be calculated approximating Equations (3) and (4) as:
$$
R_{sys}(t_i) = 1 - P_4(t_i), \quad \text{for } i = 1,2,\ldots,2000
\tag{21}
$$
$$
MTTF_{sys} = \left[ R_{sys}(t_1) + R_{sys}(t_{2000}) + 2\sum_{i=2}^{1999} R_{sys}(t_i) \right] \frac{t_{end}}{2 \cdot 1999}
\tag{22}
$$
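A rough sketch of the numerical workflow in Python is given below. It is not the ode15s-based implementation mentioned above: as a simplifying assumption, the algebraic constraint is eliminated (P4 = 1 − P1 − P2 − P3) and the three remaining ODEs are integrated with a fixed-step classical Runge–Kutta scheme on the grid of Equation (20). The names `integrate_reliability` and `mttf_trapezoid` are hypothetical, and `rates(t)` stands for any callable returning the four approximated rates λk,a(t).

```python
def rhs(t, p, rates, p_f, p_r):
    """Right-hand side of the first three equations of the DAE system (16)."""
    l1, l2, l3, l4 = rates(t)
    p1, p2, p3 = p
    return (-(l1 + l3) * p1 + (1.0 - p_r) * l4 * p2,
            (1.0 - p_f) * l1 * p1 - (l4 + l2) * p2,
            l3 * p1 - l1 * p3)

def integrate_reliability(rates, p_f, p_r, t_end, n=2000):
    """Return R_sys at the n evenly spaced nodes of Eq. (20), using fixed-step RK4."""
    h = t_end / (n - 1)
    p = (1.0, 0.0, 0.0)                      # initial conditions: the system starts in State 1
    rel = [1.0]
    for i in range(n - 1):
        t = i * h
        k1 = rhs(t, p, rates, p_f, p_r)
        k2 = rhs(t + h / 2, tuple(x + h / 2 * k for x, k in zip(p, k1)), rates, p_f, p_r)
        k3 = rhs(t + h / 2, tuple(x + h / 2 * k for x, k in zip(p, k2)), rates, p_f, p_r)
        k4 = rhs(t + h, tuple(x + h * k for x, k in zip(p, k3)), rates, p_f, p_r)
        p = tuple(x + h / 6 * (a + 2 * b + 2 * c + d)
                  for x, a, b, c, d in zip(p, k1, k2, k3, k4))
        rel.append(sum(p))                   # R_sys = 1 - P4 = P1 + P2 + P3, Eq. (21)
    return rel

def mttf_trapezoid(rel, t_end):
    """Trapezoidal approximation of the integral of R_sys over [0, t_end], Eq. (22)."""
    n = len(rel)
    return (rel[0] + rel[-1] + 2.0 * sum(rel[1:-1])) * t_end / (2.0 * (n - 1))
```

For constant rates, e.g. `rates = lambda t: (1.0, 2.0, 0.5, 4.0)`, the resulting MTTF can be compared against the analytical solution of Section 4. A stiff solver remains the safer choice when the rates vary strongly with time.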

6. Simulation Solution

This solution is applicable for any set of distribution (First Case or Second Case) and for any type of aging of the backup component in standby (full aging, no aging, or partial aging). Any simulation uses multiple pseudo-realities to study the system in question. The information from each generated pseudo-reality will be kept in an EC, whose definition and properties will be discussed in Section 6.1. In Section 6.2 we will concentrate on the development of specific functions generating random time intervals for the 2SBRSBF system. Those functions will be used in Section 6.3 where an algorithm will be developed to generate a random EC describing the 2SBRSBF system. In Section 6.4 we will extract the information in the generated ECs to calculate the state probability functions and the rest of the reliability characteristics of a 2SBRSBF system.

6.1. Definition and Properties of the Event Chains for 2SBRSBF

In the simulational solution, we generate a large count N of pseudo-realities in which we observe the behavior of the 2SBRSBF system from time 0 to system failure or to time tend, whichever comes first. As in the numerical solution (described in Section 5), the constant tend is selected sufficiently large, so that $R_{sys}(t_{end}) < 0.01$. The pseudo-realities are described with the ECs introduced in [22], where the EC of the jth pseudo-reality is defined as the set:
$$
EC_j = \{ (timepsr_j(k),\ statepsr_j(k)) \mid k = 1,2,\ldots,q_j \}
\tag{23}
$$
The notation in Equation (23) shows that the jth pseudo-reality contains qj state transitions (called events), where the kth consecutive event, which happened at time timepsrj(k), is a transition to the state/substate statepsrj(k). The latter is coded either with 1, 2, and 3 respectively for State 1, State 2, and State 3, or with 40, 41, 42, and 43 respectively for system failure type b, type a, type c, and type d (all of them denoting State 4). Any EC for a 2SBRSBF system should have the following properties:
p1) It contains at least one event: qj ≥ 1.
p2) The events happen at strictly increasing times: timepsrj(k)< timepsrj(k + 1) for k = 1,2,…,(qj − 1).
p3) The initial event is at time zero: timepsrj(1) = 0.
p4) The final event happens before tend: timepsrj(qj) < tend.
p5) The simulation starts with a fully operational system: statepsrj(1) = 1.
p6) Whenever a system failure is observed the simulation ends: if statepsrj(b) > 3, then qj = b.
p7) Whenever the State 3 is observed either it is the last event, or the next event is the system failure type d: if statepsrj(b) = 3, then either qj = b, or qj = (b + 1) and statepsrj(qj) = 43.
p8) The State 3 and the State 4 (in all its substates) can happen only once: #[statepsrj(k) = 3] ≤ 1, #[statepsrj(k) = 40] ≤ 1, #[statepsrj(k) = 41] ≤ 1, #[statepsrj(k) = 42] ≤ 1, #[statepsrj(k) = 43] ≤ 1.
p9) The State 1 and State 2 alternate at the beginning of the EC up to and including the hth event, and neither one happens later: statepsrj(k) = 1 if and only if k is odd and k ≤ h, whereas statepsrj(k) = 2 if and only if k is even and k ≤ h.
p10) There could be a maximum of two events after h: h ≤ qj ≤ (h + 2).
p11) If there are events after the hth one, they are either a transition to State 3 or a transition to State 4 (in all its substates): statepsrj(k) ≥ 3 for all k > h and k ≤ qj.
p12) The State 2 can be observed only on an even position and the previous event is always a transition to State 1: if statepsrj(b) = 2, then b is even and statepsrj(b − 1) = 1.
p13) The State 3 can be observed only on an even position and the previous event is always a transition to State 1: if statepsrj(b) = 3, then b is even and statepsrj(b − 1) = 1.
The formulated EC properties will facilitate the generation of time-period variates presented in Section 6.2. The algorithm described in Section 6.3 will generate ECs with the formulated EC properties. The latter will be used in Section 6.4 to prove the methods for extracting reliability information from the generated set of ECs for 2SBRSBF system.
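Several of the listed properties lend themselves to an automated sanity check of simulated chains. The following Python sketch covers p1–p8 together with p12 and p13; it assumes an EC is stored as a list of `(time, state)` tuples, and the function name `violated_properties` is a hypothetical choice.

```python
def violated_properties(ec, t_end):
    """Return the labels of the EC properties (p1-p8, p12, p13) violated by `ec`."""
    if len(ec) < 1:
        return ["p1"]                                   # p1: at least one event
    times = [t for t, _ in ec]
    states = [s for _, s in ec]
    last = len(ec) - 1
    bad = []
    if any(a >= b for a, b in zip(times, times[1:])):
        bad.append("p2")                                # p2: strictly increasing times
    if times[0] != 0:
        bad.append("p3")                                # p3: first event at time zero
    if times[-1] >= t_end:
        bad.append("p4")                                # p4: last event before t_end
    if states[0] != 1:
        bad.append("p5")                                # p5: start in State 1
    if any(s > 3 for s in states[:-1]):
        bad.append("p6")                                # p6: a system failure ends the chain
    for b, s in enumerate(states):
        if s == 3 and b != last and not (b == last - 1 and states[-1] == 43):
            bad.append("p7")                            # p7: State 3 is last or followed by 43
    if any(states.count(code) > 1 for code in (3, 40, 41, 42, 43)):
        bad.append("p8")                                # p8: State 3 and failures occur once
    for b, s in enumerate(states):                      # b is 0-based, so even positions are odd b
        if s in (2, 3) and (b % 2 == 0 or states[b - 1] != 1):
            bad.append("p12" if s == 2 else "p13")
    return bad
```

An empty list means that none of the checked properties is violated; for instance, the chain `[(0.0, 1), (1.2, 2), (1.9, 1), (3.0, 3), (3.4, 43)]` passes for `t_end = 10.0`.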

6.2. Generating Time Periods Using Conditional Distributions from 2SBRSBF

As discussed in Section 3, to simulate an EC of a 2SBRSBF system we need to generate random time-periods complying with the conditional failure distributions f1(τ|t), f3(τ|t), and f2(τ|tage) and with the conditional repair distribution f4(τ|t), where t and tage are non-negative values.
We do not know which of the four original distributions, fk(t) (k = 1,2,3,4), are defined only in the non-negative domain and which are defined on the entire real axis, so we need to substitute them with their truncated distributions, fk,trun(t) = fk(t|0) for k = 1,2,3,4. Note that if a distribution is defined only in the non-negative domain, then fk,trun(t) = fk(t|0) = fk(t), so we can safely work only with truncated distributions. Strictly speaking, we therefore need to generate time-period variates from the conditional truncated distributions f1,trun(τ|t), f3,trun(τ|t), f2,trun(τ|tage), and f4,trun(τ|t). However, for any k it is true that:
$$
f_{k,trun}(\tau \mid t) = \frac{f_{k,trun}(\tau + t)}{R_{k,trun}(t)} = \frac{f_k(\tau + t \mid 0)}{R_k(t \mid 0)} = \frac{f_k(\tau + t + 0)/R_k(0)}{R_k(t + 0)/R_k(0)} = \frac{f_k(\tau + t)}{R_k(t)} = f_k(\tau \mid t)
\tag{24}
$$
According to Equation (24), the conditional truncated distributions coincide with the conditional original distributions. When t and tage are known quantities, we can generate random time-period variates as special cases of the Practical Indirect Sampling Method from Conditional CDF (PISMCF) [17], where the algorithm is motivated, formalized, illustrated, and proven. On its basis we can define a three-attribute procedure, PISMCF(.), which generates a numerically stable random time-interval variate, $\Delta\tau$, from a given conditional CDF, F(t|Tsurv), where Tsurv is a non-negative real number representing the time of survival:
$$
\Delta\tau = \mathrm{PISMCF}\left(F(.),\ F^{-1}(.),\ T_{surv}\right)
\tag{25}
$$
In Equation (25), F(.) is the unconditional CDF which can express F(t|Tsurv) using Equation (26):
$$
1 - F(t \mid T_{surv}) = \frac{1 - F(t + T_{surv})}{1 - F(T_{surv})}
\tag{26}
$$
The second argument, F–1(.), of the PISMCF procedure from Equation (25) being the inverse CDF, can be used to estimate the time tλ where the denominator of Equation (26) is 100 machine epsilons (ϵ):
$$
t_\lambda = F^{-1}(1 - 100\varepsilon)
\tag{27}
$$
In short, the algorithm for estimating Equation (25) is: (a) calculate tλ using Equation (27); (b) if Tsurv < tλ, then set Tcut = Tsurv, else set Tcut = tλ; (c) generate RD as a uniformly distributed variate in the unit interval (0,1); (d) estimate pRD = 1 − RD [1 − F(Tcut)]; (e) set $\Delta\tau = F^{-1}(p_{RD}) - T_{cut}$, so that $\Delta\tau$ is the interval measured from the survival time, as required by Equation (26).
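A minimal Python sketch of such a sampling procedure follows. It is an interpretation under stated assumptions rather than the reference implementation from [17]: the interval is obtained by subtracting Tcut from the inverted probability (which is what Equation (26) implies for an interval measured from Tsurv), RD is drawn from (0, 1], and two numerical guards (keeping the probability below 1 and clamping tiny negative results to zero) are added here for robustness. The helper `exponential` is only an illustrative distribution.

```python
import math
import random
import sys

EPS = sys.float_info.epsilon

def exponential(rate):
    """Illustrative distribution: returns (cdf, inverse_cdf) of an exponential law."""
    return (lambda t: 1.0 - math.exp(-rate * t),
            lambda p: -math.log(1.0 - p) / rate)

def pismcf(cdf, inv_cdf, t_surv, rng):
    """Sample the remaining time after survival to t_surv from the conditional CDF (26)."""
    t_lam = inv_cdf(1.0 - 100.0 * EPS)             # (a) Eq. (27)
    t_cut = t_surv if t_surv < t_lam else t_lam    # (b) cap the survival time
    rd = 1.0 - rng.random()                        # (c) uniform variate on (0, 1]
    p_rd = 1.0 - rd * (1.0 - cdf(t_cut))           # (d) unconditional probability level
    p_rd = min(p_rd, 1.0 - EPS / 2.0)              # guard (added here): keep p_rd below 1
    return max(0.0, inv_cdf(p_rd) - t_cut)         # (e) interval; clamp rounding noise
```

Because the exponential distribution is memoryless, `pismcf(*exponential(2.0), 3.0, random.Random(1))` should behave like a plain exponential variate with mean 0.5, which gives a simple way to sanity-check the sketch.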
Let us assume that while performing the simulation of the jth pseudo-reality for the 2SBRSBF system we have observed only the first kcur state events. The simulation probably will continue and therefore, the EC is yet incomplete:
$$
EC_j^{inc} = \{ (timepsr_j(k),\ statepsr_j(k)) \mid k = 1,2,\ldots,k_{cur} \}
\tag{28}
$$
Then, the current state of the system is scur = statepsrj(kcur) and the simulational time is Tcur = timepsrj(kcur) < tend (see EC property p4). The incomplete EC in Equation (28) is never empty since kcur ≥ 1 (see EC properties p1, p3, and p5).
If scur > 3, we do not need to generate any time-period variates since it shows a system failure, i.e., end of the simulation in the jth pseudo-reality (see EC properties p8, p11, and p6).
If scur is 1, we need to generate two possible time-period variates: the time to failure of the primary unit, $\Delta\tau_{1,fp}$, and the time to failure in standby of the backup unit, $\Delta\tau_{1,fb}$. Using Equation (25):
$$
\Delta\tau_{1,fp} = \mathrm{PISMCF}\left(F_1(.),\ F_1^{-1}(.),\ T_{cur}\right)
\tag{29}
$$
$$
\Delta\tau_{1,fb} = \mathrm{PISMCF}\left(F_3(.),\ F_3^{-1}(.),\ T_{cur}\right)
\tag{30}
$$
If scur is 3, we do not need to generate any time-period variate, since the possible time to failure of the primary unit is known to be $\Delta\tau_{1,fp} - \Delta\tau_{1,fb}$, where $\Delta\tau_{1,fp}$ and $\Delta\tau_{1,fb}$ were generated in the previous State 1 (see EC properties p9 and p13).
If scur > 3, we do not need to generate any time-period variate since we have observed a system failure of some type which means that the simulation in the jth pseudo-reality should stop and therefore qj = kcur (see EC properties p6, p8, and p11).
If scur is 2, we need to generate two possible time-period variates: the time to repair of the primary unit, $\Delta\tau_{2,rp}$, and the time to failure in operation of the backup unit, $\Delta\tau_{2,fb}$. Using Equation (25):
$$
\Delta\tau_{2,rp} = \mathrm{PISMCF}\left(F_4(.),\ F_4^{-1}(.),\ T_{cur}\right)
\tag{31}
$$
$$
\Delta\tau_{2,fb} = \mathrm{PISMCF}\left(F_2(.),\ F_2^{-1}(.),\ t_{age}\right)
\tag{32}
$$
If the 2SBRSBF operates with First Case distributions, the equivalent age, tage, of the backup unit when it starts operation at time Tcur is irrelevant, since F2(.) is the CDF of an exponential distribution. Then we can compare the state probability functions derived by the simulational solution with those acquired, on one hand, from the numerical solution of the DAE system from Equation (16) according to the GRD in Figure 1b and, on the other hand, from the analytical solution from Equations (9)–(12) according to the RD in Figure 1a.
If, however, the 2SBRSBF operates with Second Case distributions, then in order to use Equation (32), we have to determine the equivalent age, tage, at time Tcur. Since we need $\Delta\tau_{2,fb}$ only when the system is in State 2, it follows that kcur is even (see EC property p9). From the beginning of the jth pseudo-reality up to time Tcur, the backup component has been in standby (kcur/2) times, when the primary component was operating till its failure (see EC property p9). Up to Tcur, the backup component has never failed in standby when the primary component was in operation, i.e., during the compound time interval with overall positive length Tsb (see EC properties p6, p8, and p9). The latter time length can be defined using Equation (28) as:
$$
T_{sb} = \sum_{i=1}^{k_{cur}/2} \left[ timepsr_j(2i) - timepsr_j(2i-1) \right]
\tag{33}
$$
On the other hand, the backup component has been in operation (kcur/2 − 1) times, when the primary component was in successful repair (see EC property p9). Up to Tcur, the backup component has never failed during operation when the primary component was successfully repaired, i.e., during the compound time interval with overall non-negative length Toper (see EC properties p7, p8, and p9). The latter time length can be estimated by noting that up to Tcur the 2SBRSBF system is either in State 1 or in State 2:
$$
T_{oper} = T_{cur} - T_{sb}
\tag{34}
$$
The non-negative value of tage will be the sum of the backup component operation time, Toper, and the equivalent operating time, $T^{equ}$, with which the backup component would age during the standby time Tsb:
$$
t_{age} = T_{oper} + T^{equ}
\tag{35}
$$
The equivalent operating time $T^{equ}$ depends on the aging-in-standby mechanism under which the 2SBRSBF system functions. There are three alternative assumptions for the nature of this mechanism: full aging, no aging, or partial aging.
The full aging assumption accepts that the aging of the backup component during standby is the same as during operation (see Figure 2a):
$$
T^{equ} = T_{sb} \ \Rightarrow\ t_{age} = T_{oper} + T_{sb} = T_{cur} - T_{sb} + T_{sb} = T_{cur}
\tag{36}
$$
Under the full aging assumption, a 2SBRSBF can be described with the DAE system from Equation (16) according to the GRD in Figure 1b, which allows us to acquire a numerical solution for Second Case distributions. The numerical solution can be compared with the simulational state probability functions.
The no aging assumption accepts that the backup component during standby never ages (see Figure 2b):
$$
T^{equ} = 0 \ \Rightarrow\ t_{age} = T_{oper} + 0 = T_{cur} - T_{sb}
\tag{37}
$$
Under the no aging assumption, a 2SBRSBF cannot be described with a DAE system since no RD adequately reflects the reliability behavior of the 2SBRSBF. For Second Case distributions, the only possible solution is the simulational one.
The partial aging assumption accepts that the backup component in standby ages to the same reliability as the backup component in operation during the equivalent operating time $T_{sb}^{equ}$ (the same quantity that is denoted $T^{equ}$ in Equation (35); see Figure 2c):
$$
F_{2,trun}\left(T_{sb}^{equ}\right) = F_{3,trun}\left(T_{sb}\right)
\tag{38}
$$
Equation (38) in simplified form was first proposed in [22], where it was successfully tested for a Two-Component Standby System with Failures in Standby. In a real 2SBRSBF system, the failures of the backup component will be more frequent during operation than during standby, which means that F2,trun(t) ≥ F3,trun(t) for any non-negative time t. This inequality, together with Equation (38), assures that practically always $T^{equ} \in [0, T_{sb}]$. Applying Equation (15) twice to Equation (38) we get:
$$
T^{equ} =
\begin{cases}
F_2^{-1}(p_{equ}), & \text{if } p_{equ} < 1 - 100\varepsilon\\
t_{\lambda,2}, & \text{if } p_{equ} \ge 1 - 100\varepsilon
\end{cases}
\quad \text{where } p_{equ} = 1 - \frac{1 - F_2(0)}{1 - F_3(0)}\,\left[1 - F_3(T_{sb})\right]
\tag{39}
$$
In Equation (39), $t_{\lambda,2}$ is calculated with Equation (13) for k = 2. Equation (39) therefore uses the ideas behind Equation (14) for stable approximation of the equivalent operating time, $T^{equ}$, at any $T_{cur} \in [0, t_{end}]$ for an arbitrary incomplete EC from Equation (28) describing the behavior of a 2SBRSBF.
Under the partial aging assumption, the 2SBRSBF cannot be described with a DAE system since no RD adequately reflects the reliability behavior of the 2SBRSBF similarly to the no aging assumption. Again, for Second Case distributions, the only possible solution is the simulational one.
We combined Equations (33)–(39) into a six-attribute procedure, TAGEASS(.), which gives numerically stable estimates for the equivalent age, tage, of the backup unit under any of the three aging assumptions:
$$
t_{age} = \mathrm{TAGEASS}\left(F_2(.),\ F_2^{-1}(.),\ F_3(.),\ EC_j^{inc},\ T_{cur},\ FlagA\right)
\tag{40}
$$
In Equation (40), the variable FlagA is 1, 2 or 3, respectively when the 2SBRSBF operates under the full aging, no aging, or partial aging assumptions. Then Equation (40) can be estimated using Algorithm 1.
Algorithm 1 Equivalent Age Estimation in the jth Pseudo-Reality for 2SBRSBF
1) Calculate the total standby time of the backup component, Tsb, using (33).
2) Calculate the total operational time of the backup component, Toper, using (34).
3) If FlagA = 1 (full aging assumption), then calculate the equivalent operating time, Tequ, using (36).
4) If FlagA = 2 (no aging assumption), then calculate the equivalent operating time, Tequ, using (37).
5) If FlagA = 3 (partial aging assumption), then:
5.1) Calculate the positive constant, $t_{\lambda,2}$, using (13) with k = 2.
5.2) Calculate the probability, pequ, using the second part of (39).
5.3) Calculate the equivalent operating time, Tequ, using the first part of (39).
6) Calculate the equivalent age of the backup component, tage, using (35).
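The steps of Algorithm 1 can be sketched in Python as follows. The sketch assumes that the incomplete EC is a list of `(time, state)` tuples whose last event is a transition to State 2; the function name `tageass`, the argument order, and the `exponential` helper are illustrative assumptions that merely mirror Equation (40).

```python
import math
import sys

EPS100 = 100.0 * sys.float_info.epsilon

def exponential(rate):
    """Illustrative distribution: returns (cdf, inverse_cdf) of an exponential law."""
    return (lambda t: 1.0 - math.exp(-rate * t),
            lambda p: -math.log(1.0 - p) / rate)

def tageass(cdf2, inv2, cdf3, ec_inc, t_cur, flag_a):
    """Equivalent age of the backup unit; flag_a is 1 (full), 2 (no), or 3 (partial aging)."""
    # Step 1, Eq. (33): State 1 periods start at the odd events (0-based indices 0, 2, ...)
    t_sb = sum(ec_inc[i + 1][0] - ec_inc[i][0] for i in range(0, len(ec_inc), 2))
    t_oper = t_cur - t_sb                                  # Step 2, Eq. (34)
    if flag_a == 1:
        t_equ = t_sb                                       # Step 3, Eq. (36)
    elif flag_a == 2:
        t_equ = 0.0                                        # Step 4, Eq. (37)
    else:
        t_lam2 = inv2(1.0 - EPS100)                        # Step 5.1, Eq. (13) with k = 2
        p_equ = 1.0 - (1.0 - cdf2(0.0)) / (1.0 - cdf3(0.0)) * (1.0 - cdf3(t_sb))  # Step 5.2
        t_equ = inv2(p_equ) if p_equ < 1.0 - EPS100 else t_lam2                   # Step 5.3
    return t_oper + t_equ                                  # Step 6, Eq. (35)
```

As an illustration, take the chain `[(0.0, 1), (2.0, 2), (2.5, 1), (4.0, 2)]` at `t_cur = 4.0`: the standby time is 3.5 and the operating time is 0.5, so the full aging and no aging assumptions give equivalent ages of 4.0 and 0.5, respectively. With exponential rates 2 (operation) and 1 (standby), partial aging gives 0.5 + 3.5/2 = 2.25.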

6.3. Event Chain Generation for 2SBRSBF

After developing the procedures for random time-period generation in Section 6.2, we may simulate an EC for the jth pseudo-reality of a 2SBRSBF system which satisfies all EC properties defined in Section 6.1.
The following is given:
(1) For each k = 1, 2, 3, 4, the original CDFs, Fk(t), defined for any real argument.
(2) For each k = 1, 2, 3, 4, the original inverse CDFs, $F_k^{-1}(p)$, defined for any p belonging to the unit interval.
(3) The probability, pf, for switching failure on demand.
(4) The probability, pr, for back-switching failure on demand.
(5) The value of FlagA, which determines under which aging assumption the 2SBRSBF operates.
(6) The positive final simulation time, tend, such that $R_{sys}(t_{end}) \approx 0$ (< 0.01).
(7) The consecutive number, j, of the pseudo-reality.
The event chain for the jth pseudo-reality, ECj can be calculated using Algorithm 2.
Algorithm 2 Generation of the Event Chain for the jth Pseudo-Reality of 2SBRSBF
1) Initiate the incomplete event chain, $EC_j^{inc}$:
1.1) Set Tcur = 0 (the current system time is zero)
1.2) Set kcur = 1 (the current count of events is one)
1.3) Set timepsrj(kcur) = Tcur (the time of the first event is zero)
1.4) Set statepsrj(kcur) = 1 (the system starts from State 1)
2) If statepsrj(kcur) = 1 (the system is currently in State 1), then:
2.1) Estimate $\Delta\tau_{1,fp} = \mathrm{PISMCF}(F_1(.), F_1^{-1}(.), T_{cur})$ (the possible time to failure of the primary unit).
2.2) Estimate $\Delta\tau_{1,fb} = \mathrm{PISMCF}(F_3(.), F_3^{-1}(.), T_{cur})$ (the possible time to standby failure of the backup unit).
2.3) If $t_{end} \le T_{cur} + \Delta\tau_{1,fp}$ and $t_{end} \le T_{cur} + \Delta\tau_{1,fb}$ (the end of simulation comes first), then:
2.3.1) Set qj = kcur (the last event count in ECj)
2.3.2) Set $EC_j = EC_j^{inc}$ (the final ECj)
2.3.3) Stop the Algorithm
2.4) If $\Delta\tau_{1,fp} \le \Delta\tau_{1,fb}$ (the primary unit is failing first), then:
2.4.1) Set kcur = kcur + 1 (new event)
2.4.2) Set Tcur = Tcur + $\Delta\tau_{1,fp}$ (new current system time)
2.4.3) Set timepsrj(kcur) = Tcur (the time of the new event)
2.4.4) Generate RN as a uniformly distributed number in the unit interval (check which is the new state)
2.4.4.1) If RN > pf, then statepsrj(kcur) = 2 (i.e., no switching failure, move to State 2)
2.4.4.2) If RN ≤ pf, then statepsrj(kcur) = 41 (i.e., switching failure, move to State 4, type a)
2.5) If $\Delta\tau_{1,fp} > \Delta\tau_{1,fb}$ (the backup unit is failing first), then:
2.5.1) Set kcur = kcur + 1 (new event)
2.5.2) Set Tcur = Tcur + $\Delta\tau_{1,fb}$ (new current system time)
2.5.3) Set timepsrj(kcur) = Tcur (the time of the new event)
2.5.4) Set statepsrj(kcur) = 3 (move to State 3)
3) If statepsrj(kcur) = 2 (the system is currently in State 2), then:
3.1) Estimate $\Delta\tau_{2,rp} = \mathrm{PISMCF}(F_4(.), F_4^{-1}(.), T_{cur})$ (the possible time to repair of the primary unit).
3.2) Estimate $t_{age} = \mathrm{TAGEASS}(F_2(.), F_2^{-1}(.), F_3(.), EC_j^{inc}, T_{cur}, FlagA)$ (the equivalent age of the backup unit)
3.3) Estimate $\Delta\tau_{2,fb} = \mathrm{PISMCF}(F_2(.), F_2^{-1}(.), t_{age})$ (the possible time to operational failure of the backup unit).
3.4) If $t_{end} \le T_{cur} + \Delta\tau_{2,rp}$ and $t_{end} \le T_{cur} + \Delta\tau_{2,fb}$ (the end of simulation comes first), then:
3.4.1) Set qj = kcur (the last event count in ECj)
3.4.2) Set $EC_j = EC_j^{inc}$ (the final ECj)
3.4.3) Stop the Algorithm
3.5) If $\Delta\tau_{2,rp} \le \Delta\tau_{2,fb}$ (the primary unit is repaired first), then:
3.5.1) Set kcur = kcur + 1 (new event)
3.5.2) Set Tcur = Tcur + $\Delta\tau_{2,rp}$ (new current system time)
3.5.3) Set timepsrj(kcur) = Tcur (the time of the new event)
3.5.4) Generate RN as a uniformly distributed number in the unit interval (check which is the new state)
3.5.4.1) If RN > pr, then statepsrj(kcur) = 1 (no back-switching failure, move to State 1)
3.5.4.2) If RN ≤ pr, then statepsrj(kcur) = 40 (back-switching failure, move to State 4, type b)
3.6) If $\Delta\tau_{2,rp} > \Delta\tau_{2,fb}$ (the backup unit is failing first), then:
3.6.1) Set kcur = kcur + 1 (new event)
3.6.2) Set Tcur = Tcur + $\Delta\tau_{2,fb}$ (new current system time)
3.6.3) Set timepsrj(kcur) = Tcur (the time of the new event)
3.6.4) Set statepsrj(kcur) = 42 (move to State 4, type c)
4) If statepsrj(kcur) = 3 (the system is currently in State 3), then:
4.1) If $t_{end} \le T_{cur} + \Delta\tau_{1,fp} - \Delta\tau_{1,fb}$ (the end of simulation comes first), then:
4.1.1) Set qj = kcur (the last event count in ECj)
4.1.2) Set $EC_j = EC_j^{inc}$ (the final ECj)
4.1.3) Stop the Algorithm
4.2) If $t_{end} > T_{cur} + \Delta\tau_{1,fp} - \Delta\tau_{1,fb}$ (the primary unit is failing first), then:
4.2.1) Set kcur = kcur + 1 (new event)
4.2.2) Set Tcur = Tcur + $\Delta\tau_{1,fp} - \Delta\tau_{1,fb}$ (new current system time)
4.2.3) Set timepsrj(kcur) = Tcur (the time of the new event)
4.2.4) Set statepsrj(kcur) = 43 (switching failure, move to State 4, type d)
5) If statepsrj(kcur) > 3 (the system is currently in State 4), then:
5.1) Set qj = kcur (the last event count in ECj)
5.2) Set $EC_j = EC_j^{inc}$ (the final ECj)
5.3) Stop the Algorithm
6) Go to Step 2 (try a next transition)
It is easy to demonstrate that any EC generated by Algorithm 2 satisfies all EC properties formulated in Section 6.1.
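To illustrate the control flow of Algorithm 2, the following Python sketch generates an EC for First Case (exponential) distributions only. Under that assumption the conditional sampling reduces to drawing plain exponential variates, so neither PISMCF nor TAGEASS is needed; Second Case distributions would require both. The function name `generate_ec_first_case` and the tuple-based EC layout are illustrative choices.

```python
import random

def generate_ec_first_case(rates, p_f, p_r, t_end, rng):
    """Generate one event chain as a list of (time, state) tuples for constant rates."""
    l1, l2, l3, l4 = rates                       # primary, backup (oper.), backup (standby), repair
    ec = [(0.0, 1)]                              # Step 1: start in State 1 at time zero
    dt_fp = dt_fb = 0.0
    while True:
        t_cur, state = ec[-1]
        if state == 1:                           # Step 2
            dt_fp = rng.expovariate(l1)          # possible time to primary failure
            dt_fb = rng.expovariate(l3)          # possible time to standby failure of backup
            if t_cur + min(dt_fp, dt_fb) >= t_end:
                return ec                        # the end of simulation comes first
            if dt_fp <= dt_fb:                   # primary fails first: switching on demand
                ec.append((t_cur + dt_fp, 2 if rng.random() > p_f else 41))
            else:                                # backup fails in standby
                ec.append((t_cur + dt_fb, 3))
        elif state == 2:                         # Step 3
            dt_rp = rng.expovariate(l4)          # possible time to repair of the primary
            dt_ob = rng.expovariate(l2)          # possible time to operational backup failure
            if t_cur + min(dt_rp, dt_ob) >= t_end:
                return ec
            if dt_rp <= dt_ob:                   # repair first: back-switching on demand
                ec.append((t_cur + dt_rp, 1 if rng.random() > p_r else 40))
            else:
                ec.append((t_cur + dt_ob, 42))   # backup fails in operation (type c)
        elif state == 3:                         # Step 4: reuse variates from the last State 1
            t_fail = t_cur + (dt_fp - dt_fb)
            if t_fail >= t_end:
                return ec
            ec.append((t_fail, 43))              # primary fails with a dead backup (type d)
        else:                                    # Step 5: system failure ends the chain
            return ec
```

A call such as `generate_ec_first_case((1.0, 2.0, 0.5, 4.0), 0.1, 0.05, 60.0, random.Random(2021))` returns one chain; averaging the final event times over many chains gives a Monte Carlo estimate of the MTTF that can be compared with Equation (12).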

6.4. Extracting Reliability Information from the Simulated ECs

Let N be a large positive integer representing the count of the randomly simulated pseudo-realities. Using Algorithm 2, we can simulate ECj, for j = 1,2, …, N. In this section, we will extract the reliability information from the simulated ECs, an approach which is the essence of any Monte Carlo simulation [37] (pp. 290–294).
Let us calculate the state probability functions at the 2000 evenly distributed time points from 0 to tend given in Equation (20). For a given ECj we can estimate the state, $S_{i,j}$, at each of the time points ti:
$$
S_{i,j} =
\begin{cases}
statepsr_j(k), & \text{if } timepsr_j(k) \le t_i < timepsr_j(k+1), \text{ for } k < q_j\\
statepsr_j(q_j), & \text{if } timepsr_j(q_j) \le t_i \le t_{end}
\end{cases}
\quad \text{where } i = 1,2,\ldots,2000;\ j = 1,2,\ldots,N
\tag{41}
$$
From Equation (41) it is easy to estimate the values of the first three state probability functions at the time point, ti:
$$
P_g(t_i) = \frac{1}{N}\,\#\{ S_{i,j} = g \mid j = 1,2,\ldots,N \}, \quad \text{where } g = 1,2,3 \text{ and } i = 1,2,\ldots,2000
\tag{42}
$$
In Equation (42), $\#\{ S_{i,j} = g \mid j = 1,2,\ldots,N \}$ is the count of all states at the time point ti which are equal to g.
The fourth state probability function can be estimated using Equation (1) as:
$$
P_4(t_i) = 1 - P_1(t_i) - P_2(t_i) - P_3(t_i), \quad \text{for } i = 1,2,\ldots,2000
\tag{43}
$$
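The estimation in Equations (41)–(43) amounts to a look-up of the last event not later than each grid time, followed by counting. A minimal Python sketch is shown below, assuming ECs stored as lists of `(time, state)` tuples; the function names `state_at` and `state_probabilities` are hypothetical.

```python
from bisect import bisect_right

def state_at(ec, t):
    """State of one pseudo-reality at time t, Eq. (41); failure substates 40-43 map to 4."""
    times = [time for time, _ in ec]
    state = ec[bisect_right(times, t) - 1][1]    # last event with time <= t
    return 4 if state > 3 else state

def state_probabilities(ecs, t_end, n_points=2000):
    """Return (grid, probs), where probs[i] = (P1, P2, P3, P4) at grid[i], Eqs. (42)-(43)."""
    grid = [i * t_end / (n_points - 1) for i in range(n_points)]   # Eq. (20)
    probs = []
    for t in grid:
        counts = [0, 0, 0]
        for ec in ecs:
            s = state_at(ec, t)
            if s <= 3:
                counts[s - 1] += 1
        p1, p2, p3 = (c / len(ecs) for c in counts)                # frequentist estimates
        probs.append((p1, p2, p3, 1.0 - p1 - p2 - p3))             # P4 as the complement
    return grid, probs
```

For two toy chains, `[(0.0, 1), (1.0, 2), (2.0, 40)]` and `[(0.0, 1), (3.0, 3)]`, evaluated on five grid points up to `t_end = 4.0`, the estimate at t = 2 is (0.5, 0.0, 0.0, 0.5): one chain has already failed and the other is still in State 1.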
The reliability function and the MTTFsys can be approximated with Equations (21) and (22). According to the EC property p6 (a system failure is never followed by another event), the reliability in Equation (21) has non-increasing nodes:
$$
R_{sys}(t_i) \ge R_{sys}(t_{i+1}), \quad \text{for } i = 1,2,\ldots,1999
\tag{44}
$$
One way to identify the α-design life, tdes,α, for a given α is to transform the nodes, $\{ (t_i, R_{sys}(t_i)) \mid i = 1,2,\ldots,2000 \}$, of the system reliability from Equation (21) into strictly decreasing purged nodes $\{ (t_i^{pu}, R_{sys}^{pu}(t_i^{pu})) \mid i = 1,2,\ldots,n_{pu} \}$, where:
$$
R_{sys}^{pu}(t_i^{pu}) > R_{sys}^{pu}(t_{i+1}^{pu}), \quad \text{for } i = 1,2,\ldots,n_{pu}-1
\tag{45}
$$
Such a purging procedure is proposed in [17], where the algorithm is motivated, formalized, illustrated, and proven. In short, it runs in the steps summarized in Algorithm 3.
Algorithm 3 Purging Algorithm
(a) Identify the time of the first purged node, $(t_1^{pu}, R_{sys}^{pu}(t_1^{pu}) = 1)$, as the greatest ti for which $R_{sys}(t_i) = 1$;
(b) Substitute all internal nodes with equal reliability with one purged node in the center of the horizontal platform;
(c) Identify the time of the last purged node, $(t_{n_{pu}}^{pu}, R_{sys}^{pu}(t_{n_{pu}}^{pu}))$, as the smallest ti for which $R_{sys}(t_i) = R_{sys}(t_{2000})$.
Having the strictly decreasing purged system reliability function, we can identify the α-design life, tdes,α, for any $\alpha \in [R_{sys}^{pu}(t_{n_{pu}}^{pu}), 1]$:
$$
t_{des,\alpha} = t_i^{pu} + \frac{\left[ R_{sys}^{pu}(t_i^{pu}) - \alpha \right] \left( t_{i+1}^{pu} - t_i^{pu} \right)}{R_{sys}^{pu}(t_i^{pu}) - R_{sys}^{pu}(t_{i+1}^{pu})}, \quad \text{for } R_{sys}^{pu}(t_i^{pu}) \ge \alpha > R_{sys}^{pu}(t_{i+1}^{pu})
\tag{46}
$$
As discussed in Section 3, the reliability numerical characteristics Mediansys, B1_life, B10_life, and IQRsys can be estimated as $t_{des,0.5}$, $t_{des,0.99}$, $t_{des,0.9}$, and $t_{des,0.25} - t_{des,0.75}$ respectively by applying Equation (46) five times.
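The purging steps and the interpolation of Equation (46) can be sketched in Python as follows. This is a simplified reading of the three purging steps, not the procedure from [17]; it assumes the reliability nodes are non-increasing and start at 1, and the names `purge` and `design_life` are hypothetical. Returning the last purged time when α equals the last purged reliability is a boundary convention added here.

```python
from itertools import groupby

def purge(times, rel):
    """Collapse each run of equal reliability values into a single purged node."""
    runs = [[t for t, _ in run] for _, run in groupby(zip(times, rel), key=lambda node: node[1])]
    levels = [r for r, _ in groupby(rel)]
    purged = []
    for idx, (ts, r) in enumerate(zip(runs, levels)):
        if idx == 0:
            t = ts[-1]                      # (a) greatest time with R = 1
        elif idx == len(runs) - 1:
            t = ts[0]                       # (c) smallest time with the final reliability
        else:
            t = 0.5 * (ts[0] + ts[-1])      # (b) center of the horizontal platform
        purged.append((t, r))
    return purged

def design_life(purged, alpha):
    """Linear interpolation of Eq. (46) between neighboring purged nodes."""
    for (t0, r0), (t1, r1) in zip(purged, purged[1:]):
        if r0 >= alpha > r1:
            return t0 + (r0 - alpha) * (t1 - t0) / (r0 - r1)
    if alpha == purged[-1][1]:
        return purged[-1][0]                # boundary convention (assumption)
    raise ValueError("alpha is outside the range covered by the purged nodes")
```

With `purged = purge(times, rel)`, the median would then be `design_life(purged, 0.5)` and the interquartile range `design_life(purged, 0.25) - design_life(purged, 0.75)`.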
The simulational solution is universal and exists even when the numerical and analytical solutions are impossible. Even when the numerical and the analytical solutions exist, the simulational solution can provide richer reliability information.
For example, it is obvious that the 2SBRSBF system will have 100% chance to ever be in the State 1. It is also clear that if tend is correctly selected, then the 2SBRSBF system will have more than 99% chance to ever be in the State 4. However, it is interesting to know the chance, P2,ever, for the 2SBRSBF system to ever be in the State 2, since that probability will help us plan the resources needed for the repair of the primary unit. Similarly, the chance, P3,ever, for the 2SBRSBF system to ever be in the State 3 is important, since that will show us the prevalence of the failure in standby of the backup unit. So, for a given 2SBRSBF system, we can estimate the chances, Pg,ever, for g = 1,2,3:
$$P_{g,ever} = \frac{100}{N}\,\#\left\{\exists\, i \text{ that } S_{i,j} = g \mid j = 1, 2, \dots, N\right\}, \quad \text{where } g = 1, 2, 3 \tag{47}$$
In Equation (47), $\#\{\exists\, i \text{ that } S_{i,j} = g \mid j = 1, 2, \dots, N\}$ is the count of pseudo-realities in which State g can be found at least once. Similarly, for a given 2SBRSBF system we can estimate the chance, P4,ever, as:
$$P_{4,ever} = \frac{100}{N}\,\#\left\{\exists\, i \text{ that } S_{i,j} > 3 \mid j = 1, 2, \dots, N\right\} \tag{48}$$
In Equation (48), $\#\{\exists\, i \text{ that } S_{i,j} > 3 \mid j = 1, 2, \dots, N\}$ is the count of pseudo-realities in which State 4 (system failure) can be found at least once.
Another example of reliability information that can be acquired neither with the numerical nor with the analytical solution is the set of four conditional chances, $P_{g,ever}^{cond}$ (for g = 40, 41, 42, 43), of the 2SBRSBF system to have, respectively, a type b, type a, type c, or type d system failure, provided that the system has failed:
$$P_{g,ever}^{cond} = \frac{100\,\#\left\{S_{i,q_j} = g \mid j = 1, 2, \dots, N\right\}}{N\,P_{4,ever}/100}, \quad \text{where } g = 40, 41, 42, 43 \tag{49}$$
The information in Equation (49) allows us to identify the types of system failures that dominate the 2SBRSBF system. That knowledge will increase the efficiency of the reliability improvement measures. Equations (42) and (47)–(49) use the frequentist interpretation of probability [40] (pp. 42–43).
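The frequentist counts in Equations (47)–(49) can be sketched in Python as follows. The data layout is an assumption made for illustration: each pseudo-reality is stored as the list of states it visits, and a failed pseudo-reality ends with one of the codes 40, 41, 42, or 43. The function name `ever_chances` and the four toy histories are hypothetical.

```python
def ever_chances(histories):
    """Estimate P_g,ever (g = 1..4) and the conditional failure-type chances, in percent."""
    n = len(histories)
    # Equation (47): share of pseudo-realities that visit State g at least once.
    p_ever = {g: 100.0 * sum(g in h for h in histories) / n for g in (1, 2, 3)}
    # Equation (48): share of pseudo-realities that contain a failure code (> 3).
    failed = [h for h in histories if any(s > 3 for s in h)]
    p_ever[4] = 100.0 * len(failed) / n
    # Equation (49): failure-type shares among the failed pseudo-realities only.
    n_failed = n * p_ever[4] / 100.0
    p_cond = {
        g: (100.0 * sum(h[-1] == g for h in failed) / n_failed) if failed else float("nan")
        for g in (40, 41, 42, 43)
    }
    return p_ever, p_cond

# Four toy pseudo-realities; the last one has not failed before the end of the simulation.
histories = [[1, 41], [1, 2, 1, 2, 42], [1, 3, 43], [1, 2, 1]]
p_ever, p_cond = ever_chances(histories)
print(p_ever)
print(p_cond)
```

By construction, the four conditional chances sum to 100% whenever at least one pseudo-reality has failed.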
Knowing how to simulate an EC for the jth pseudo-reality of a 2SBRSBF system allows us to develop the simulational solution of a given 2SBRSBF system. The following are given:
(1) For each k = 1, 2, 3, 4, the original CDFs, $F_k(t)$, defined for any real argument.
(2) For each k = 1, 2, 3, 4, the original inverse CDFs, $F_k^{-1}(p)$, defined for any p belonging to the unit interval.
(3) The probability, pf, for switching failure on demand.
(4) The probability, pr, for back-switching failure on demand.
(5) The value of the FlagA, which determines under which aging assumption the 2SBRSBF operates.
The algorithm proposed in [17] uses simulation to find the reliability characteristics of two-component standby systems with switching failures and aging in standby. The simulational solution for the 2SBRSBF system can be obtained through a generalization of that algorithm, which is formalized as Algorithm 4 below.
Algorithm 4 Simulational Solution of a 2SBRSBF System
1) Select the count N of pseudo-realities to be simulated as a large integer.
2) Select the final simulation time, tend, as a positive real number.
3) Set j = 1 (initiate the consecutive number of the simulated pseudo-reality).
4) Generate the ECj, using Algorithm 2.
5) Set j = j + 1 (move to the next pseudo-reality).
6) If j ≤ N, then go to Step 4 (repeat the EC generation N times).
7) Estimate 2000 equally spaced times, $t_i$, in the closed interval [0, tend] using Equation (20).
8) Estimate the states, $S_{t_i,j}$, using Equation (41).
9) Estimate the first three state probability functions, $P_g(t_i)$ for g = 1, 2, 3, at the time points $t_i$ using Equation (42).
10) Estimate the fourth state probability function, $P_4(t_i)$, at the time points $t_i$ using Equation (43).
11) Estimate the system reliability function, $R_{sys}(t_i)$, at the time points $t_i$ using Equation (21).
12) Estimate the system mean time to failure, $MTTF_{sys}$, using Equation (22).
13) Estimate the nodes, $\{(t_i^{pu}, R_{sys}^{pu}(t_i^{pu})) \mid i = 1, 2, \dots, n_{pu}\}$, of the invertible reliability function using Algorithm 3.
14) Estimate the design lives, $t_{des,0.5}$, $t_{des,0.99}$, $t_{des,0.9}$, $t_{des,0.25}$, and $t_{des,0.75}$, using Equation (46) five times.
15) Set the median time, $Median_{sys} = t_{des,0.5}$.
16) Set the B1 life to $t_{des,0.99}$.
17) Set the B10 life to $t_{des,0.9}$.
18) Set the interquartile range, $IQR = t_{des,0.25} - t_{des,0.75}$.
19) Estimate the first three unconditional chances, $P_{g,ever}$ (for g = 1, 2, 3), using Equation (47).
20) Estimate the fourth unconditional chance, P4,ever, using Equation (48).
21) Estimate the conditional chances, $P_{g,ever}^{cond}$ (for g = 40, 41, 42, 43), using Equation (49).
With the formulation of Algorithm 4 the universal simulational solution for a 2SBRSBF system is complete.
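To give a feeling for the Monte Carlo loop in Steps 1 to 6, the following Python sketch simulates pseudo-realities for the special case of constant failure/repair rates, where the process is memoryless and no aging bookkeeping is needed. It is a deliberately simplified stand-in, not Algorithm 2 or Algorithm 4: the transition structure is read off the state equations of the constant-rate model, the four rates are hypothetical (they are not the Table 2 values), and the assignment of code 40 to back-switching failures (type b) is our reading of the failure-type list. Only pf = 0.12 and pr = 0.03 are taken from the text.

```python
import random

RATES = (1 / 2000, 1 / 1000, 1 / 4000, 1 / 50)  # hypothetical lambda_1..lambda_4, in 1/h
PF, PR = 0.12, 0.03                               # switching / back-switching failure chances

def simulate_once(rates, pf, pr, rng):
    """Return (time to system failure, failure code) for one pseudo-reality."""
    l1, l2, l3, l4 = rates
    t, state = 0.0, 1
    while True:
        if state == 1:                            # primary operates, backup in standby
            t += rng.expovariate(l1 + l3)
            if rng.random() < l3 / (l1 + l3):
                state = 3                         # backup fails unnoticed in standby
            elif rng.random() < pf:
                return t, 41                      # type a: switching failure
            else:
                state = 2
        elif state == 2:                          # backup operates, primary in repair
            t += rng.expovariate(l2 + l4)
            if rng.random() < l2 / (l2 + l4):
                return t, 42                      # type c: backup fails in operation
            if rng.random() < pr:
                return t, 40                      # type b: back-switching failure
            state = 1
        else:                                     # state 3: primary operates alone
            t += rng.expovariate(l1)
            return t, 43                          # type d: primary fails, no backup left

def mttf_exact(rates, pf, pr):
    """Exact MTTF of the same Markov model by first-step analysis (for comparison)."""
    l1, l2, l3, l4 = rates
    a, b, c = 1 / (l1 + l3), (1 - pf) * l1 / (l1 + l3), l3 / (l1 + l3)
    d, e = 1 / (l2 + l4), (1 - pr) * l4 / (l2 + l4)
    return (a + b * d + c / l1) / (1 - b * e)

rng = random.Random(2021)
N = 20000
runs = [simulate_once(RATES, PF, PR, rng) for _ in range(N)]
mttf_sim = sum(t for t, _ in runs) / N
type_chance = {g: 100 * sum(code == g for _, code in runs) / N for g in (40, 41, 42, 43)}
print(mttf_sim, mttf_exact(RATES, PF, PR), type_chance)
```

For time-dependent rates this shortcut no longer applies, because the remaining life of each component then depends on its accumulated age; that is exactly the bookkeeping that the event chronologies and the aging assumptions take care of.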

7. Illustrative Examples

7.1. Examples Setup

We shall analyze three Illustrative Examples. In all of them, the probability for switching failure is pf = 0.12, whereas the probability for back-switching failure is pr = 0.03. The ratio between those values is plausible for the following reasons. If the switching is successful, it means that the switching device operated properly. Then a back-switching failure is less probable since it will be demanded shortly afterwards (the repair time of the primary component is much smaller than its failure time).
In Example 1, each of the four original distributions has a constant failure/repair rate λk shown in Table 2 (for k = 1, 2, 3, 4). The PDFs of the original exponential distributions are:
$$f_k(t) = \lambda_k e^{-\lambda_k t}, \quad \text{for } t \ge 0, \text{ where } k = 1, 2, 3, 4 \tag{50}$$
The PDFs, the reliability/repair functions, and the failure/repair rates of the truncated distributions from Equation (50) are plotted in Figure 3. Example 1 will illustrate the behavior of the 2SBRSBF system with First Case distributions. Here, the original and the truncated distributions coincide.
In Example 2, the original distributions are as follows:
(1) a Rayleigh distribution with parameter b1 [41] for the failures of the primary component:
$$f_1(t) = \frac{t}{b_1^2}\, e^{-0.5\,(t/b_1)^2}, \quad \text{for } t \ge 0 \tag{51}$$
(2) a normal distribution with mean value μ2 h and standard deviation σ2 h [42] for the failures of the backup component in operation:
$$f_2(t) = \frac{1}{\sqrt{2\pi}\,\sigma_2}\, e^{-0.5\,(t-\mu_2)^2/\sigma_2^2}, \quad \text{for } t \in (-\infty, +\infty) \tag{52}$$
(3) a Weibull distribution with a scale parameter θ3 h and a shape parameter β3 [43] for the failures of the backup component in standby:
$$f_3(t) = \frac{\beta_3}{\theta_3}\left(\frac{t}{\theta_3}\right)^{\beta_3 - 1} e^{-(t/\theta_3)^{\beta_3}}, \quad \text{for } t \ge 0 \tag{53}$$
(4) a lognormal distribution with median time tmed,4 h and shape parameter s4 [44] for the repairs of the primary component:
$$f_4(t) = \frac{1}{\sqrt{2\pi}\,s_4\,t}\, e^{-0.5\,\ln^2(t/t_{med,4})/s_4^2}, \quad \text{for } t \ge 0 \tag{54}$$
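Since the simulation needs both the CDFs and the inverse CDFs of the four distribution families, a compact Python sketch of such pairs is given here. All parameter values are hypothetical placeholders (the Table 3 values are not reproduced), the truncation of the normal distribution at zero is ignored, and the closed-form inverses are standard results for these families rather than code from the paper.

```python
import math
from statistics import NormalDist

STD = NormalDist()                      # standard normal, used by the lognormal model

B1 = 1500.0                             # hypothetical Rayleigh parameter, h
MU2, SIGMA2 = 1000.0, 300.0             # hypothetical normal mean and std, h
THETA3, BETA3 = 4000.0, 1.5             # hypothetical Weibull scale (h) and shape
TMED4, S4 = 40.0, 0.8                   # hypothetical lognormal median (h) and shape

def rayleigh_pdf(t):
    """Rayleigh density in the form of Equation (51)."""
    return (t / B1 ** 2) * math.exp(-0.5 * (t / B1) ** 2)

# For each k: (CDF F_k, inverse CDF F_k^{-1}).
MODELS = {
    1: (lambda t: 1 - math.exp(-0.5 * (t / B1) ** 2),
        lambda p: B1 * math.sqrt(-2 * math.log(1 - p))),
    2: (NormalDist(MU2, SIGMA2).cdf, NormalDist(MU2, SIGMA2).inv_cdf),
    3: (lambda t: 1 - math.exp(-((t / THETA3) ** BETA3)),
        lambda p: THETA3 * (-math.log(1 - p)) ** (1 / BETA3)),
    4: (lambda t: STD.cdf(math.log(t / TMED4) / S4),
        lambda p: TMED4 * math.exp(S4 * STD.inv_cdf(p))),
}

# Round trip F(F^{-1}(p)) = p checks that each pair is consistent.
round_trip_error = max(
    abs(cdf(inv(p)) - p) for cdf, inv in MODELS.values() for p in (0.1, 0.5, 0.9)
)

# Trapezoidal check that the Rayleigh density integrates to (almost) one.
steps, upper = 20000, 10 * B1
h = upper / steps
area = h * (sum(rayleigh_pdf(i * h) for i in range(1, steps))
            + 0.5 * (rayleigh_pdf(0.0) + rayleigh_pdf(upper)))
print(round_trip_error, area)
```

An inverse CDF evaluated at a uniform random number in the unit interval yields a random time from the corresponding distribution, which is the usual way such functions are used in a simulation.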
The original distributions of Example 2 are described in Table 3. The PDFs, the reliability/repair functions, and the failure/repair rates of the truncated distributions from Equations (51)–(54) are plotted in Figure 4. Example 2 will illustrate the behavior of the 2SBRSBF system with Second Case distributions where the failures of the backup component in operation have an Increasing Failure Rate (IFR). Such a typical situation can occur when the operational failure is caused mainly by heavy wear in the backup component [11] (pp. 73–75). Here, the original and the truncated distributions coincide except for f2(t) and f2,trunc(t).
In Example 3, the distributions are the same as in Example 2, except for the second type, which changes to:
(2) a lognormal distribution with median time tmed,2 h and shape parameter s2 for the failures of the backup component in operation:
$$f_2(t) = \frac{1}{\sqrt{2\pi}\,s_2\,t}\, e^{-0.5\,\ln^2(t/t_{med,2})/s_2^2}, \quad \text{for } t \ge 0 \tag{55}$$
The original distributions of Example 3 are described in Table 4. The PDFs, the reliability/repair functions, and the failure/repair rates of the truncated distributions from Equations (51), (53)–(55) are plotted in Figure 5. Example 3 will illustrate the behavior of the 2SBRSBF system with Second Case distributions where the failures of the backup component in operation have a Decreasing Failure Rate (DFR). Such an atypical situation can occur when the operational failure is caused mainly by high infant mortality in the backup component [11] (pp. 73–75). Here, the original and the truncated distributions coincide.

7.2. Example 1 Solution

Since Example 1 deals with First Case distributions, the type of aging has no effect on the reliability performance of the 2SBRSBF system. The simulation solution was obtained by Algorithm 4 with N = 10,000 pseudo-realities for time from 0 to tend = 20,000 h. Four typical pseudo-realities are shown in Table 5, where the different types of system failures are demonstrated. The four state probability functions are shown in Figure 6a–d, respectively. The system reliability function is depicted in Figure 7. The simulation reliability at tend was negligible (Rsys(20,000) = 0.0024 < 0.01, as required), which justifies the selection of tend. Important simulation numerical characteristics of the 2SBRSBF reliability can be found in Table 6. The chances of some events of interest (described in Section 6.4) can be found in Table 7. It is revealing to see that the backup component has approximately a 69% chance to endure failure in standby (State 3). Another useful fact is that the switching failures (Type a) are more frequent than the backup component failures in operation (Type c) (17% vs. 11% conditional chance). That fact suggests that it is easier to improve the reliability by upgrading the switching mechanism than by upgrading the backup unit.
The simulation results were verified by comparison with the precise analytical solution (see Section 4). According to Table 6, the precise analytical MTTF is 4282 h, whereas the simulational MTTF is estimated as 4294 h, an error of less than 0.3%.
Also, the simulational results were verified by comparison with the numerical solution (see Section 5), which, as seen from Figure 6 and Figure 7, produced curves indistinguishable from the simulational state probabilities and the simulational reliability function. According to Table 6, the numerical MTTF is 4280 h, whereas the simulational MTTF is estimated as 4294 h. The numerical solution for Example 1 (as well as for Examples 2 and 3) was derived by solving the index-1 DAE system described in Section 5 with the MATLAB multistep procedure ode15s.m. The software successfully integrated the DAE system from 0 to tend = 20,000 h using a variable-step method of variable order from 1 to 5 [45].
As seen from Figure 6 and Figure 7, the analytical and the numerical solutions produce curves indistinguishable from the simulational state probabilities and the simulational reliability function. The observed overlap is an essential part of the verification of the presented simulation algorithm: in the case of exponential distributions, the model is Markovian, so the analytical, the numerical, and the simulational solutions should practically coincide.

7.3. Example 2 Solution

Since Example 2 deals with Second Case distributions, the type of aging has an effect on the reliability performance of the 2SBRSBF system. Three simulational solutions were obtained by repeatedly using Algorithm 4 with N = 10,000 pseudo-realities for the three aging assumptions: full aging, no aging, and partial aging of the backup component in standby. Each of those solutions was estimated for time from 0 to tend = 8000 h. The three sets of four state probability functions are shown in Figure 8a–d, respectively. The three system reliability functions are depicted in Figure 9. The simulational reliabilities at tend were negligible and much lower than 0.01 (for full aging, Rsys(8000) = 0; for no aging, Rsys(8000) = 0.0025; for partial aging, Rsys(8000) = 0.0003), which justifies the selection of tend.
Important simulational numerical characteristics of the 2SBRSBF reliabilities can be found in Table 8 for the three types of aging. The chances of some events of interest (described in Section 6.4) can be found in Table 9 for each of the three aging assumptions. It is revealing to see that the backup component has between a 31% and a 41% chance to endure failure in standby (State 3), depending on the aging model. An interesting dynamic is observed in the conditional chances of observing the different types of failure. At full aging, the backup component failures during primary repair (Type c) have more than a 50% chance, whereas the primary component failures after failure in standby (Type d) constitute only around 30% of the system failures. At no aging, the backup failures in operation are less likely and, therefore, the primary component failures after backup failure in standby (Type d) are more frequent than the backup component failures during primary repair (Type c) (41% vs. 36% conditional chance). At the same time, Type c and Type d system failures are almost equally likely at partial aging of the backup component in standby (37% vs. 42% conditional chance). Those facts suggest that, to increase the reliability of the 2SBRSBF system, it is of paramount importance to correctly identify the aging mechanism of the backup unit during standby.
In Example 2, the distribution of the backup component failures in operation has an IFR (see the blue line in Figure 4c), indicating that wear-out is the most likely reason for those failures. This is by far the most widespread case in engineering practice, where the backup component operates at the rear end of the bathtub curve [46]. Then, more severe aging should increase the failure incidence of the backup component in operation and subsequently decrease the reliability. As expected, the system reliability function is the best at no aging and the worst at full aging (see Figure 9 for 1500–5500 h). The MTTF increases from 2837 h at full aging, through 3242 h at partial aging, to 3457 h at no aging, which corresponds to a substantial 21% improvement. Similar behavior can be observed in the median, the B10 life, and the B1 life (see Table 8). Another expected result is that the state probability functions for partial aging lie between the state probability functions for no aging and full aging (see Figure 8). The real distinction between the three curves can be seen in the State 2 probability function (Figure 8b), which is very sensitive to the aging mode. The observed forms of the State 2 probability functions are justifiable, since more severe aging increases the incidence of failures of the operational backup unit; each such failure moves the system to State 4, which decreases the probability of the 2SBRSBF system being in State 2. All the above can serve as a qualitative validation of Algorithm 4 for simulating the reliability behavior of the 2SBRSBF system.
Also, the simulational results were quantitatively verified by comparison with the numerical solution (as described in Section 7.2), which, as seen from Figure 8 and Figure 9, produced curves indistinguishable from the simulational state probabilities and the simulational reliability function in the case of full aging of the backup component during standby. This overlap is an important result: under the full-aging assumption the model is semi-Markovian, so the numerical and the simulational solutions should practically coincide. According to Table 8, the numerical MTTF and the simulational MTTF at full aging are estimated to be equal (2837 h). Note that the analytical solution cannot be derived in Example 2 since the failure/repair rates are not constant.

7.4. Example 3 Solution

Since Example 3 deals with Second Case distributions, similarly to Example 2, the type of aging has an effect on the reliability performance of the 2SBRSBF system. Three simulational solutions were obtained by repeatedly using Algorithm 4 with N = 10,000 pseudo-realities for the three aging assumptions: full aging, no aging, and partial aging of the backup component in standby. Each of those solutions was estimated for time from 0 to tend = 12,000 h. The three sets of four state probability functions are shown in Figure 10a–d, respectively. The three system reliability functions are depicted in Figure 11. The simulational reliabilities at tend were negligible and lower than 0.01 (for full aging, Rsys(12,000) = 0.0062; for no aging, Rsys(12,000) = 0.0011; for partial aging, Rsys(12,000) = 0.002), which justifies the selection of tend.
Important simulational numerical characteristics of the 2SBRSBF reliabilities can be found in Table 10 for the three types of aging. The chances of some events of interest (described in Section 6.4) can be found in Table 11 for each of the three aging assumptions.
In Example 3, the distribution of the backup component failures in operation has a DFR (see the blue line in Figure 5c), indicating that infant mortality is the most likely reason for those failures. This is a very rare case in engineering practice, where the backup component operates at the front end of the bathtub curve. Then, more severe aging should decrease the failure incidence of the backup component in operation and subsequently increase the reliability. As expected, the system reliability function is the worst at no aging and the best at full aging (see Figure 11 for 2000–8000 h). The MTTF increases from 3139 h at no aging, through 3187 h at partial aging, to 3625 h at full aging, which corresponds to a noticeable 16% improvement. Similar behavior can be observed in the median, the B10 life, and the B1 life (see Table 10). Another expected result is that the state probability functions for partial aging lie between the state probability functions for no aging and full aging (see Figure 10). The real distinction between the three curves can be seen in the State 2 probability function (Figure 10b), which is very sensitive to the aging mode. The observed forms of the State 2 probability functions are justifiable, since more severe aging decreases the incidence of failures of the operational backup unit (each of which would move the system to State 4) and thus increases the probability of the 2SBRSBF system being in State 2. All the above can serve as a qualitative validation of Algorithm 4 for simulating the reliability behavior of the 2SBRSBF system.
A partial overlap between the no aging simulation solution and the partial aging simulation solution can be spotted in Figure 10 and Figure 11. The same can also be observed in Figure 8 and Figure 9 to a lesser extent. Those partial overlaps reflect the fact that for almost all realistic distribution sets, under the applied method, the solution of partial aging is much closer to the solution with no aging assumption than to the solution with full aging assumption.
Again, the simulational results were quantitatively verified by comparison with the numerical solution (as described in Section 7.2), which, as seen from Figure 10 and Figure 11, produced curves indistinguishable from the simulational state probabilities and the simulational reliability function in the case of full aging of the backup component during standby (for a comment on the observed overlap see Section 7.3). According to Table 10, the numerical MTTF and the simulational MTTF at full aging are estimated to be virtually equal (3653 h vs. 3652 h, respectively). Note that the analytical solution cannot be derived in Example 3 since the failure/repair rates are not constant.

8. Conclusions

In this paper, we investigated the reliability effect of introducing a primary component minimal repair in a two-component standby system with switching failures and aging in warm-standby. A novel analytical solution was derived for distributions with constant failure/repair rates. Under a full aging assumption of the backup component during standby, an index-1 DAE system of four simultaneous equations with a constant singular mass matrix was proposed and solved to numerically approximate the state probability functions and system reliability. A universal simulational algorithm was designed to solve the 2SBRSBF system under three types of aging. That algorithm generates pseudo-realities with ECs, which satisfy the newly formulated EC properties for the 2SBRSBF system. A novel function to assess the equivalent age of the backup component under arbitrary aging mechanisms was proposed and utilized during the EC generation. The system has a stable operation with any type of distribution. There is a significant practical benefit in the ability of the user to write their own distribution functions, which reflect several modes of failure during operation, several modes of failure during warm-standby, and several modes of repair.
Three numerical examples were elaborated to validate quantitatively and qualitatively the simulational solution. To model the 2SBRSBF system with partial aging in standby, we assumed that the backup component in standby ages to the same reliability as the backup component in operation. That is a logical and plausible hypothesis that allows us to produce a tractable aging model whose results can be treated as a best estimate. Even if the real aging mechanism is different, the numerical examples show that the partial aging results will always be bounded by the full aging and the no-aging results. That fact allows the designers and the maintenance staff to correctly assess the effect of alternative measures aiming at improving the system reliability even if the precise mechanism of aging in standby is unknown.
Although our model may look too specific and simplified, it is easily scalable. The demonstrated methodology can easily be applied to a multiple-component warm-standby system with an arbitrary configuration. We have not given such an example in this work purely because of space constraints. Any different aging assumptions can be incorporated by modifying Algorithm 1 (hence the function TAGEASS). All aspects and elements of such a multi-component warm-standby system can be found in the 2SBRSBF system. In such a way, our model is suitable for applications in industrial systems, manufacturing, design of ship electrical and propulsion systems, power plants, etc.
As a direction for future studies, we may study the ways to adapt our procedures to the case of perfect repair [10] and intermediate repair [8], as this work only analyzed the case of minimal repair.

Author Contributions

Conceptualization, K.T. and N.N.; methodology, K.T., N.N. and S.C.; software, K.T. and S.C.; validation, S.C. and G.F.; investigation, K.T., B.M. and G.F.; data curation, B.M. and S.C.; writing—original draft, K.T., S.C. and B.M.; writing—review and editing, N.N., G.F. and B.M.; visualization, K.T., N.N. and S.C. All authors have read and agreed to the published version of the manuscript.

Funding

This research received no external funding.

Institutional Review Board Statement

Not applicable.

Informed Consent Statement

Not applicable.

Data Availability Statement

No new data were created or analyzed in this study. Data sharing is not applicable to this article.

Conflicts of Interest

The authors declare no conflict of interest.

Appendix A

Given: Let the random variable T be the time to failure (or repair) of a component. Also, let the random event A(T0) be that the component is operational at, or has not been repaired up to, time T0. In fact, T is the deterministic time T0 plus the random time period till the next event (failure or repair). This definition of T is true only for Appendix A. Then:
  • The unconditional Cumulative Distribution Function (CDF) of T is $F(t)$, for $t \in [0, \infty)$.
  • The unconditional Probability Density Function (PDF) of T is $f(t)$, for $t \in [0, \infty)$.
  • The unconditional reliability (repair) function of T is $R(t) = 1 - F(t)$, for $t \in [0, \infty)$.
  • The unconditional failure (repair) rate of T is $\lambda(t) = f(t)/R(t)$, for $t \in [0, \infty)$.
  • The conditional CDF of T if A(T0) is $F_{cond}(\tau \mid T_0)$, for $\tau = t - T_0 \in [0, \infty)$.
  • The conditional PDF of T if A(T0) is $f_{cond}(\tau \mid T_0) = \dfrac{dF_{cond}(\tau \mid T_0)}{d\tau}$, for $\tau = t - T_0 \in [0, \infty)$.
  • The conditional reliability (repair) function of T if A(T0) is $R_{cond}(\tau \mid T_0) = 1 - F_{cond}(\tau \mid T_0)$, for $\tau = t - T_0 \in [0, \infty)$.
  • The conditional failure (repair) rate of T if A(T0) is $\lambda_{cond}(\tau \mid T_0) = \dfrac{f_{cond}(\tau \mid T_0)}{R_{cond}(\tau \mid T_0)}$, for $\tau = t - T_0 \in [0, \infty)$.
Prove: The unconditional and the conditional failure (repair) rates are equal for any $t^* \ge T_0$, i.e., $\lambda(t^*) = \lambda_{cond}(t^* - T_0 \mid T_0)$, for $t^* \in [T_0, \infty)$.
Proof. 
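The unconditional R(t) and f(t) are given in Figure A1a,c. The relationship between these functions is:
$$f(t) = \frac{dF(t)}{dt} = \frac{d\left[1 - R(t)\right]}{dt} = -\frac{dR(t)}{dt}, \quad \text{for } t \in [0, \infty) \tag{A1}$$
Similarly, the conditional $R_{cond}(\tau \mid T_0)$ and $f_{cond}(\tau \mid T_0)$ are given in Figure A1b,d. The relationship between these functions is:
$$f_{cond}(\tau \mid T_0) = \frac{dF_{cond}(\tau \mid T_0)}{d\tau} = \frac{d\left[1 - R_{cond}(\tau \mid T_0)\right]}{d\tau} = -\frac{dR_{cond}(\tau \mid T_0)}{d\tau}, \quad \text{for } \tau = t - T_0 \in [0, \infty) \tag{A2}$$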
Figure A1. A generic distribution described by: (a) unconditional reliability; (b) conditional reliability; (c) unconditional density; (d) conditional density.
According to [11] (p. 72), the value of the conditional $R_{cond}(\tau \mid T_0)$ can be expressed as the ratio of two unconditional values of $R(t)$:
$$R_{cond}(\tau \mid T_0) = \frac{R(\tau + T_0)}{R(T_0)}, \quad \text{for } \tau = t - T_0 \in [0, \infty) \tag{A3}$$
The interdependency between Figure A1a,b illustrates Equation (A3). The constant $R(T_0)$ is the height of the red vertical line in Figure A1a.
Let us take the first derivative of Equation (A3) with respect to τ and multiply both sides by negative 1. Then,
$$-\frac{dR_{cond}(\tau \mid T_0)}{d\tau} = -\frac{d}{d\tau}\left[\frac{R(\tau + T_0)}{R(T_0)}\right], \quad \text{for } \tau = t - T_0 \in [0, \infty) \tag{A4}$$
Let us simplify the RHS of Equation (A4) using Equation (A1) and utilizing that $\tau = t - T_0$:
$$\begin{aligned} -\frac{d}{d\tau}\left[\frac{R(\tau + T_0)}{R(T_0)}\right] &= -\frac{1}{R(T_0)}\frac{dR(\tau + T_0)}{d\tau} = -\frac{1}{R(T_0)}\frac{dR(\tau + T_0)}{d(\tau + T_0)}\frac{d(\tau + T_0)}{d\tau} = -\frac{1}{R(T_0)}\frac{dR(t)}{dt}\frac{dt}{d\tau} \\ &= \frac{1}{R(T_0)}\, f(t)\, \frac{d(\tau + T_0)}{d\tau} = \frac{f(\tau + T_0)}{R(T_0)} \cdot 1 = \frac{f(\tau + T_0)}{R(T_0)} \end{aligned} \tag{A5}$$
According to Equation (A2), the LHS of Equation (A4) is $f_{cond}(\tau \mid T_0)$. Then, from Equations (A4) and (A5) it follows that:
$$f_{cond}(\tau \mid T_0) = \frac{f(\tau + T_0)}{R(T_0)}, \quad \text{for } \tau = t - T_0 \in [0, \infty) \tag{A6}$$
The interdependency between Figure A1c,d illustrates Equation (A6). The constant $R(T_0)$ is the area of the green patch in Figure A1c, since from Equation (A1) it follows that $R(T_0) = \int_{T_0}^{\infty} f(t)\,dt$.
The conditional failure (repair) rate of T if A(T0) can be transformed using Equations (A3) and (A6):
$$\lambda_{cond}(\tau \mid T_0) = \frac{f_{cond}(\tau \mid T_0)}{R_{cond}(\tau \mid T_0)} = \frac{f(\tau + T_0)}{R(T_0)} \div \frac{R(\tau + T_0)}{R(T_0)} = \frac{f(\tau + T_0)}{R(\tau + T_0)}, \quad \text{for } \tau = t - T_0 \in [0, \infty) \tag{A7}$$
Let us select a time point $t^* \ge T_0$. The unconditional failure (repair) rate of T at time t* is:
$$\lambda(t^*) = \frac{f(t^*)}{R(t^*)} \tag{A8}$$
The numerator and the denominator in Equation (A8) are respectively the heights of the blue lines in Figure A1a,c. The conditional time τ is simply the time t delayed by T0, i.e., $t = \tau + T_0$.
From here, the relative time moment $\tau^*$ which coincides with time t* is:
$$\tau^* = t^* - T_0 \tag{A9}$$
Equation (A9) is illustrated by the transition from Figure A1a to Figure A1b, and by the transition from Figure A1c to Figure A1d.
The value of $\lambda_{cond}(\tau \mid T_0)$ at the relative time point $\tau^*$ can be easily calculated from Equation (A7) utilizing Equations (A8) and (A9):
$$\lambda_{cond}(t^* - T_0 \mid T_0) = \frac{f_{cond}(t^* - T_0 \mid T_0)}{R_{cond}(t^* - T_0 \mid T_0)} = \lambda_{cond}(\tau^* \mid T_0) = \frac{f(\tau^* + T_0)}{R(\tau^* + T_0)} = \frac{f(t^*)}{R(t^*)} = \lambda(t^*), \quad \text{for } t^* \in [T_0, \infty) \tag{A10}$$
 □.
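The statement proven above can also be checked numerically. The following Python sketch is an illustration under assumed values, not part of the proof: it takes a Weibull distribution with hypothetical parameters, builds the conditional reliability from Equation (A3), differentiates it by a central finite difference to obtain the conditional density of Equation (A2), and compares the resulting conditional rate with the unconditional rate of Equation (A8).

```python
import math

THETA, BETA = 1000.0, 2.5        # hypothetical Weibull scale (h) and shape
T0 = 400.0                       # hypothetical conditioning time, h

def R(t):
    """Unconditional reliability of the Weibull distribution."""
    return math.exp(-((t / THETA) ** BETA))

def f(t):
    """Unconditional density, f(t) = -dR/dt as in Equation (A1)."""
    return (BETA / THETA) * (t / THETA) ** (BETA - 1) * R(t)

def R_cond(tau):
    """Conditional reliability, Equation (A3)."""
    return R(tau + T0) / R(T0)

def f_cond(tau, h=1e-3):
    """Conditional density, Equation (A2), by a central finite difference."""
    return -(R_cond(tau + h) - R_cond(tau - h)) / (2 * h)

t_star = 900.0
tau_star = t_star - T0                             # Equation (A9)
lam_uncond = f(t_star) / R(t_star)                 # Equation (A8)
lam_cond = f_cond(tau_star) / R_cond(tau_star)     # definition of the conditional rate
print(lam_uncond, lam_cond)
```

The two printed rates agree up to the finite-difference error, in line with Equation (A10).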

Appendix B

Given: Let λ1, λ2, λ3, and λ4 be real positive constants, whereas pr and pf are real positive constants less than 1. The real functions P1(t), P2(t), and P3(t) are defined in the domain $t \in [0, \infty)$ and satisfy the system from Equation (A11) of three simultaneous ordinary differential equations. The initial conditions of the functions are given in Equation (A12).
$$\begin{aligned} \frac{dP_1}{dt}(t) &= -(\lambda_1 + \lambda_3)\,P_1(t) + (1 - p_r)\,\lambda_4\,P_2(t) \\ \frac{dP_2}{dt}(t) &= (1 - p_f)\,\lambda_1\,P_1(t) - (\lambda_4 + \lambda_2)\,P_2(t) \\ \frac{dP_3}{dt}(t) &= \lambda_3\,P_1(t) - \lambda_1\,P_3(t) \end{aligned} \tag{A11}$$
$$P_1(0) = 1, \quad P_2(0) = 0, \quad P_3(0) = 0 \tag{A12}$$
Find:
(a) The solution of the initial-value problem for P1(t), P2(t), and P3(t) in the domain $t \in [0, \infty)$.
(b) The functions $P_4(t) = 1 - P_1(t) - P_2(t) - P_3(t)$ and $R_{sys}(t) = 1 - P_4(t)$ in the domain $t \in [0, \infty)$.
(c) The quantity $MTTF_{sys} = \int_0^{\infty} R_{sys}(t)\,dt$.
Solution:
(a) Taking the Laplace transform [47] (pp. 331–335) of the three equations in Equation (A11) yields a system of three algebraic equations for the Laplace transforms Y1(s), Y2(s), and Y3(s) of the functions P1(t), P2(t), and P3(t), where s is a complex number known as frequency:
$$\begin{aligned} s\,Y_1(s) - P_1(0) &= -(\lambda_1 + \lambda_3)\,Y_1(s) + (1 - p_r)\,\lambda_4\,Y_2(s) \\ s\,Y_2(s) - P_2(0) &= (1 - p_f)\,\lambda_1\,Y_1(s) - (\lambda_4 + \lambda_2)\,Y_2(s) \\ s\,Y_3(s) - P_3(0) &= \lambda_3\,Y_1(s) - \lambda_1\,Y_3(s) \end{aligned} \tag{A13}$$
Substituting Equation (A12) into Equation (A13) and simplifying gives:
$$\begin{aligned} (s + \lambda_1 + \lambda_3)\,Y_1(s) - (1 - p_r)\,\lambda_4\,Y_2(s) &= 1 \\ -(1 - p_f)\,\lambda_1\,Y_1(s) + (s + \lambda_2 + \lambda_4)\,Y_2(s) &= 0 \\ -\lambda_3\,Y_1(s) + (\lambda_1 + s)\,Y_3(s) &= 0 \end{aligned} \tag{A14}$$
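The first two equations in Equation (A14) can be solved for Y1(s) and Y2(s) using Cramer's rule [48]:
$$Y_1(s) = \frac{s + \lambda_2 + \lambda_4}{(s + \lambda_1 + \lambda_3)(s + \lambda_2 + \lambda_4) - (1 - p_r)(1 - p_f)\,\lambda_1\lambda_4} \tag{A15}$$
$$Y_2(s) = \frac{(1 - p_f)\,\lambda_1}{(s + \lambda_1 + \lambda_3)(s + \lambda_2 + \lambda_4) - (1 - p_r)(1 - p_f)\,\lambda_1\lambda_4} \tag{A16}$$
The denominator in both Equations (A15) and (A16) is a quadratic polynomial with real coefficients 1, 2K, and C:
$$(s + \lambda_1 + \lambda_3)(s + \lambda_2 + \lambda_4) - (1 - p_r)(1 - p_f)\,\lambda_1\lambda_4 = s^2 + 2Ks + C \tag{A17}$$
where the real constants K and C are:
$$K = (\lambda_1 + \lambda_2 + \lambda_3 + \lambda_4)/2, \qquad C = (\lambda_1 + \lambda_3)(\lambda_2 + \lambda_4) - (1 - p_f)(1 - p_r)\,\lambda_1\lambda_4 \tag{A18}$$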
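We will prove that the discriminant, Δ, of the quadratic polynomial in Equation (A17) is always positive:
$$\begin{aligned} \Delta &= (2K)^2 - 4 \cdot 1 \cdot C = \left[2(\lambda_1 + \lambda_2 + \lambda_3 + \lambda_4)/2\right]^2 - 4\left[(\lambda_1 + \lambda_3)(\lambda_2 + \lambda_4) - (1 - p_f)(1 - p_r)\,\lambda_1\lambda_4\right] \\ &= (\lambda_1 + \lambda_2 + \lambda_3 + \lambda_4)^2 - 4(\lambda_1 + \lambda_3)(\lambda_2 + \lambda_4) + 4(1 - p_f)(1 - p_r)\,\lambda_1\lambda_4 \\ &> (\lambda_1 + \lambda_2 + \lambda_3 + \lambda_4)^2 - 4(\lambda_1 + \lambda_3)(\lambda_2 + \lambda_4) + 4(1 - 1)(1 - p_r)\,\lambda_1\lambda_4 \\ &= (\lambda_1 + \lambda_2 + \lambda_3 + \lambda_4)^2 - 4(\lambda_1 + \lambda_3)(\lambda_2 + \lambda_4) = \left[(\lambda_1 + \lambda_3) + (\lambda_2 + \lambda_4)\right]^2 - 4(\lambda_1 + \lambda_3)(\lambda_2 + \lambda_4) \\ &= (\lambda_1 + \lambda_3)^2 + (\lambda_2 + \lambda_4)^2 + 2(\lambda_1 + \lambda_3)(\lambda_2 + \lambda_4) - 4(\lambda_1 + \lambda_3)(\lambda_2 + \lambda_4) \\ &= (\lambda_1 + \lambda_3)^2 + (\lambda_2 + \lambda_4)^2 - 2(\lambda_1 + \lambda_3)(\lambda_2 + \lambda_4) = \left[(\lambda_1 + \lambda_3) - (\lambda_2 + \lambda_4)\right]^2 \ge 0 \;\Rightarrow\; \Delta > 0 \end{aligned} \tag{A19}$$
In Equation (A19) we used that $4(1 - p_f)(1 - p_r)\,\lambda_1\lambda_4 > 0$ since $1 - p_f > 0$, $1 - p_r > 0$, $\lambda_1 > 0$, and $\lambda_4 > 0$.
From Equation (A19) it follows that the roots s1 and s2 of the quadratic polynomial in Equation (A17) are always real and different:
$$s_{1,2} = \left(-2K \pm \sqrt{\Delta}\right)/2 = \left(-2K \pm \sqrt{4K^2 - 4C}\right)/2 = -K \pm \sqrt{K^2 - C} \tag{A20}$$
In Equation (A20) we assume that s1 > s2 (i.e., $s_1 = -K + \sqrt{K^2 - C}$ and $s_2 = -K - \sqrt{K^2 - C}$). It can easily be seen that the constants s1 and s2 are always negative: K > 0, and C > 0 because $(1 - p_f)(1 - p_r) < 1$, so $\sqrt{K^2 - C} < K$.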
Using the quadratic factorization formula together with Equation (A17), the denominator in both Equations (A15) and (A16) can be factored to:
$$s^2 + 2Ks + C = 1 \cdot (s - s_1)(s - s_2) = (s - s_1)(s - s_2) \tag{A21}$$
From Equations (A15)–(A17) and (A21), Y1(s) and Y2(s) can be simplified to:
$$Y_1(s) = \frac{s + \lambda_2 + \lambda_4}{(s - s_1)(s - s_2)} \tag{A22}$$
$$Y_2(s) = \frac{(1 - p_f)\,\lambda_1}{(s - s_1)(s - s_2)} \tag{A23}$$
Substituting Equation (A22) in the third equation of Equation (A14), we can find Y3(s):
$$Y_3(s) = \frac{\lambda_3\,Y_1(s)}{\lambda_1 + s} = \frac{\lambda_3\,(s + \lambda_2 + \lambda_4)}{(s - s_1)(s - s_2)(\lambda_1 + s)} \tag{A24}$$
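The identified Y1(s), Y2(s), and Y3(s) are rational fractions according to Equations (A22)–(A24). To facilitate the inverse Laplace transform, those rational fractions can be subjected to a partial fraction decomposition [49] (pp. 533–540):
$$Y_1(s) = \frac{s + \lambda_2 + \lambda_4}{(s - s_1)(s - s_2)} = \frac{A_1}{s - s_1} + \frac{B_1}{s - s_2} \tag{A25}$$
The constants A1 and B1 in Equation (A25) are:
$$A_1 = \frac{s_1 + \lambda_2 + \lambda_4}{s_1 - s_2} \quad \text{and} \quad B_1 = \frac{s_2 + \lambda_2 + \lambda_4}{s_2 - s_1} \tag{A26}$$
$$Y_2(s) = \frac{(1 - p_f)\,\lambda_1}{(s - s_1)(s - s_2)} = \frac{A_2}{s - s_1} + \frac{B_2}{s - s_2} \tag{A27}$$
The constants A2 and B2 in Equation (A27) are:
$$A_2 = \frac{(1 - p_f)\,\lambda_1}{s_1 - s_2} \quad \text{and} \quad B_2 = \frac{(1 - p_f)\,\lambda_1}{s_2 - s_1} \tag{A28}$$
$$Y_3(s) = \frac{\lambda_3\,(s + \lambda_2 + \lambda_4)}{(s - s_1)(s - s_2)(\lambda_1 + s)} = \frac{A_3}{s - s_1} + \frac{B_3}{s - s_2} + \frac{C_3}{\lambda_1 + s} \tag{A29}$$
The constants A3, B3, and C3 in Equation (A29) are:
$$A_3 = \frac{\lambda_3\,(s_1 + \lambda_2 + \lambda_4)}{(s_1 - s_2)(\lambda_1 + s_1)}, \quad B_3 = \frac{\lambda_3\,(s_2 + \lambda_2 + \lambda_4)}{(s_2 - s_1)(\lambda_1 + s_2)}, \quad \text{and} \quad C_3 = \frac{\lambda_3\,(-\lambda_1 + \lambda_2 + \lambda_4)}{(-\lambda_1 - s_1)(-\lambda_1 - s_2)} \tag{A30}$$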
Now, we can apply the inverse Laplace transform over Equations (A25), (A27), and (A29) and find the solutions P1(t), P2(t), and P3(t) of the stated initial-value problem:
  Domain :   t 0 , P 1 t = A 1 e s 1 t B 1 e s 2 t P 2 t = A 2 e s 1 t B 2 e s 2 t P 3 t = A 3 e s 1 t B 3 e s 2 t + C 3 e λ 1 t
(b)
Using Equation (A31), the required functions can be simplified to:
$$
\begin{aligned}
\text{Domain: } t \ge 0, \quad P_4(t) &= 1 - P_1(t) - P_2(t) - P_3(t) \\
&= 1 - \left(A_1 e^{s_1 t} + B_1 e^{s_2 t}\right) - \left(A_2 e^{s_1 t} + B_2 e^{s_2 t}\right) - \left(A_3 e^{s_1 t} + B_3 e^{s_2 t} + C_3 e^{-\lambda_1 t}\right) \\
&= 1 - (A_1 + A_2 + A_3) e^{s_1 t} - (B_1 + B_2 + B_3) e^{s_2 t} - C_3 e^{-\lambda_1 t}
\end{aligned}
\tag{A32}
$$
$$
\begin{aligned}
\text{Domain: } t \ge 0, \quad R_{sys}(t) &= 1 - P_4(t) \\
&= 1 - \left[1 - (A_1 + A_2 + A_3) e^{s_1 t} - (B_1 + B_2 + B_3) e^{s_2 t} - C_3 e^{-\lambda_1 t}\right] \\
&= (A_1 + A_2 + A_3) e^{s_1 t} + (B_1 + B_2 + B_3) e^{s_2 t} + C_3 e^{-\lambda_1 t}
\end{aligned}
\tag{A33}
$$
(c)
The required improper integral for $MTTF_{sys}$ when the integrand is given by Equation (A33) can be calculated using the following formula:
$$
\int_0^{\infty} e^{-at}\,dt = \frac{1}{a} \quad \text{where } a > 0 \tag{A34}
$$
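Then,
$$
\begin{aligned}
MTTF_{sys} &= \int_0^{\infty} R_{sys}(t)\,dt \\
&= \int_0^{\infty} \left[(A_1 + A_2 + A_3) e^{s_1 t} + (B_1 + B_2 + B_3) e^{s_2 t} + C_3 e^{-\lambda_1 t}\right] dt \\
&= (A_1 + A_2 + A_3) \int_0^{\infty} e^{s_1 t}\,dt + (B_1 + B_2 + B_3) \int_0^{\infty} e^{s_2 t}\,dt + C_3 \int_0^{\infty} e^{-\lambda_1 t}\,dt \\
&= -\frac{A_1 + A_2 + A_3}{s_1} - \frac{B_1 + B_2 + B_3}{s_2} + \frac{C_3}{\lambda_1}
\end{aligned}
\tag{A35}
$$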
In the derivation shown in Equation (A35) we applied Equation (A34) three times since s1 < 0, s2 < 0, and (–λ1) < 0.
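The closed-form mean time to failure can be cross-checked against a direct numerical integration of the reliability function. The sketch below uses $R_{sys}(t) = (A_1 + A_2 + A_3) e^{s_1 t} + (B_1 + B_2 + B_3) e^{s_2 t} + C_3 e^{-\lambda_1 t}$ and its integral $-(A_1 + A_2 + A_3)/s_1 - (B_1 + B_2 + B_3)/s_2 + C_3/\lambda_1$. The rates are those of Table 2; `p_f = 0.12`, `p_r = 0.03`, the expressions for `K` and `C`, the integration horizon, and the step count are all assumptions chosen for illustration.

```python
import math

lam1, lam2, lam3, lam4 = 0.0005, 0.0008, 0.00025, 0.008  # Table 2, per hour
p_f, p_r = 0.12, 0.03                                     # ASSUMPTION: illustrative values

# ASSUMPTION: K and C reconstructed from the rate diagram.
K = 0.5 * (lam1 + lam2 + lam3 + lam4)
C = (lam1 + lam3) * (lam2 + lam4) - (1 - p_f) * (1 - p_r) * lam1 * lam4
root = math.sqrt(K * K - C)
s1, s2 = -K + root, -K - root

a = lam2 + lam4
# Sums A1 + A2 + A3 and B1 + B2 + B3 of the partial-fraction constants.
sum_A = ((s1 + a) + (1 - p_f) * lam1 + lam3 * (s1 + a) / (lam1 + s1)) / (s1 - s2)
sum_B = ((s2 + a) + (1 - p_f) * lam1 + lam3 * (s2 + a) / (lam1 + s2)) / (s2 - s1)
C3 = lam3 * (-lam1 + a) / ((-lam1 - s1) * (-lam1 - s2))


def reliability(t):
    """System reliability R_sys(t) for t >= 0 (hours)."""
    return sum_A * math.exp(s1 * t) + sum_B * math.exp(s2 * t) + C3 * math.exp(-lam1 * t)


def trapezoid(f, t_end, steps):
    """Composite trapezoidal rule on [0, t_end]."""
    h = t_end / steps
    total = 0.5 * (f(0.0) + f(t_end))
    for i in range(1, steps):
        total += f(i * h)
    return total * h


mttf_closed = -sum_A / s1 - sum_B / s2 + C3 / lam1
mttf_numeric = trapezoid(reliability, 2.0e5, 200_000)  # horizon long enough for R to vanish

print(f"closed form: {mttf_closed:.2f} h, numerical: {mttf_numeric:.2f} h")
```

With these assumed probabilities the closed form evaluates to roughly $4.282 \times 10^3$ h, which is in line with the analytical mean value reported in Table 6.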

References

  1. Hausken, K. Strategic defense and attack for series and parallel reliability systems. Eur. J. Oper. Res. 2008, 186, 856–881.
  2. Aghaei, M.; Hamadani, A.Z.; Ardakan, M.A. Redundancy allocation problem for k-out-of-n systems with a choice of redundancy strategies. J. Ind. Eng. Int. 2017, 13, 81–92.
  3. Amari, S.V.; Dill, G. A new method for reliability analysis of standby systems. In Proceedings of the 2009 Annual Reliability and Maintainability Symposium, Fort Worth, TX, USA, 26–29 January 2009; pp. 417–422.
  4. Yuan, L.; Meng, X.-Y. Reliability analysis of a warm standby repairable system with priority in use. Appl. Math. Model. 2011, 35, 4295–4303.
  5. Ardakan, M.A.; Rezvan, M.T. Multi-objective optimization of reliability–redundancy allocation problem with cold-standby strategy using NSGA-II. Reliab. Eng. Syst. Saf. 2018, 172, 225–238.
  6. Li, X.; Zhang, Z.; Wu, Y. Some new results involving general standby systems. Appl. Stoch. Model. Bus. Ind. 2009, 25, 632–642.
  7. Kwiatuszewska-Sarnecka, B. Reliability improvement of large multi-state series-parallel systems. Int. J. Autom. Comput. 2006, 3, 157–164.
  8. Yang, Q.; Zhang, N.; Hong, Y. Reliability Analysis of Repairable Systems with Dependent Component Failures Under Partially Perfect Repair. IEEE Trans. Reliab. 2013, 62, 490–498.
  9. Lindqvist, B.H. On the Statistical Modeling and Analysis of Repairable Systems. Stat. Sci. 2006, 21, 532–551.
  10. Zhang, Y.L. A geometric-process repair-model with good-as-new preventive repair. IEEE Trans. Reliab. 2002, 51, 223–228.
  11. Modarres, M.; Kaminskiy, M.; Krivtsov, V. Reliability Engineering and Risk Analysis: A Practical Guide, 2nd ed.; CRC Press: Boca Raton, FL, USA, 2010; pp. 72–227. ISBN 9780849392474.
  12. Badía, F.; Berrade, M.; Cha, J.H.; Lee, H. Optimal replacement policy under a general failure and repair model: Minimal versus worse than old repair. Reliab. Eng. Syst. Saf. 2018, 180, 362–372.
  13. Bao, Y.; Sun, X.; Trivedi, K. A Workload-Based Analysis of Software Aging, and Rejuvenation. IEEE Trans. Reliab. 2005, 54, 541–548.
  14. Nguyen, T.A.; Kim, D.S.; Park, J.S. A Comprehensive Availability Modeling and Analysis of a Virtualized Servers System Using Stochastic Reward Nets. Sci. World J. 2014, 2014, 1–18.
  15. Nguyen, T.A.; Kim, D.S.; Park, J.S. Availability modeling and analysis of a data center for disaster tolerance. Futur. Gener. Comput. Syst. 2016, 56, 27–50.
  16. Loveland, S.; Dow, E.M.; Lefevre, F.; Beyer, D.; Chan, P.F. Leveraging virtualization to optimize high-availability system configurations. IBM Syst. J. 2008, 47, 591–604.
  17. Tenekedjiev, K.; Nikolova, N.; Fan, G.; Symes, M.; Nguyen, O. Simulation algorithms to assess the impact of aging on the reliability of standby systems with switching failures. In Advances in Intelligent Systems Research and Innovation; Book Series: Studies in Systems, Decision and Control; Sgurev, V.V., Jotsov, J., Kacprzyk, J., Eds.; Springer Nature: New York, NY, USA, 2021; Chapter 21; pp. 463–496.
  18. Wells, C.E. Reliability analysis of a single warm-standby system subject to repairable and nonrepairable failures. Eur. J. Oper. Res. 2014, 235, 180–186.
  19. Ebeling, C.E. An Introduction to Reliability and Maintainability Engineering, 2nd ed.; Waveland Press Inc.: Long Grove, IL, USA, 2010; pp. 113–115. ISBN 1-57766-625-9.
  20. Ebeling, C.E. An Introduction to Reliability and Maintainability Engineering, 3rd ed.; Waveland Press Inc.: Long Grove, IL, USA, 2019; pp. 87–399. ISBN 978-1478637349.
  21. Cha, J.H.; Mi, J.; Yun, W.Y. Modelling a general standby system and evaluation of its performance. Appl. Stoch. Model. Bus. Ind. 2007, 24, 159–169.
  22. Nikolova, N.; Fan, G.; Symes, M.; Tenekedjiev, K. Simulating State-Dependent Systems with Partial Aging in Standby. In Proceedings of the IEEE 10th International Conference on Intelligent Systems (IS’2020), Varna, Bulgaria, 28–30 August 2020; pp. 51–60.
  23. Yang, L. Reliability Model for Warm Standby System under Consideration of Replace Time. Int. J. Hybrid. Inf. Technol. 2016, 9, 135–146.
  24. Bhardwaj, R.K.; Kaur, K.; Malik, S.C. Reliability indices of a redundant system with standby failure and arbitrary distribution for repair and replacement times. Int. J. Syst. Assur. Eng. Manag. 2017, 8, 423–431.
  25. Wang, K.-H.; Ke, J.-B.; Lee, W.-C. Reliability and sensitivity analysis of a repairable system with warm standbys and R unreliable service stations. Int. J. Adv. Manuf. Technol. 2007, 31, 1223–1232.
  26. Srinivasan, S.K.; Subramanian, R. Reliability analysis of a three unit warm standby redundant system with repair. Ann. Oper. Res. 2006, 143, 227–235.
  27. Maciel, P.R.M.; Dantas, J.R.; Júnior, R.d.S.M. Markov chains and stochastic Petri nets for availability and reliability modeling. In Reliability Engineering: Methods and Applications; Ram, M., Ed.; CRC Press: Boca Raton, FL, USA, 2020; pp. 127–151.
  28. Zhao, X.; Nakagawa, T. An Overview on Failure Rates in Maintenance Policies. In Reliability Engineering; CRC Press: Boca Raton, FL, USA, 2019; pp. 166–196.
  29. Kishore, J.; Goel, M.; Khanna, P. Understanding survival analysis: Kaplan-Meier estimate. Int. J. Ayurveda Res. 2010, 1, 274–278.
  30. MATLAB. MATLAB R2019a and Statistics and Machine Learning Toolbox 11.5.; The MathWorks Inc.: Natick, MA, USA, 2019.
  31. Nikolova, N.; Toneva, D.; Tsonev, Y.; Burgess, B.; Tenekedjiev, K. Novel Methods to Construct Empirical CDF for Continuous Random Variables using Censor Data. In Proceedings of the IEEE 10th International Conference on Intelligent Systems (IS’2020), Varna, Bulgaria, 28–30 August 2020; pp. 61–68.
  32. Fuzzy Rationality in Quantitative Decision Analysis. J. Adv. Comput. Intell. Intell. Inform. 2005, 9, 65–69.
  33. Nikolova, N.D.; Dimitrakiev, D.; Tenekedjiev, K.I. Fuzzy rationality in the elicitation of subjective quantiles. In Proceedings of the Second International IEEE Conference on Intelligent Systems IS’2004, Varna, Bulgaria, 22–24 June 2004; Volume 3, pp. 32–34.
  34. Attar, A.; Raissi, S.; Khalili-Damghani, K. A simulation-based optimization approach for free distributed repairable multi-state availability-redundancy allocation problems. Reliab. Eng. Syst. Saf. 2017, 157, 177–191.
  35. Williams, J.D.; Young, S. Partially observable Markov decision processes for spoken dialog systems. Comput. Speech Lang. 2007, 21, 393–422.
  36. Distefano, S.; Trivedi, K.S. Non-Markovian State-Space Models in Dependability Evaluation. Qual. Reliab. Eng. Int. 2013, 29, 225–239.
  37. Birolini, A. Reliability Engineering: Theory and Practice, 8th ed.; Springer: New York, NY, USA, 2017; pp. 290–526.
  38. Darling, R.; Norris, J. Differential equation approximations for Markov chains. Probab. Surv. 2008, 5, 37–79.
  39. MATLAB. MATLAB R2019a; The MathWorks: Natick, MA, USA, 2019.
  40. Nikolaidis, E.; Mourelatos, Z.P.; Pandey, V. Design Decisions under Uncertainty with Limited Information; CRC Press: Boca Raton, FL, USA, 2011; pp. 42–43. ISBN 9781138115095.
  41. Merovci, F.; Elbatal, I. Weibull Rayleigh distribution: Theory and applications. Appl. Math. Inf. Sci. 2015, 9, 1–11.
  42. Horrace, W.C. Moments of the truncated normal distribution. J. Prod. Anal. 2015, 43, 133–138.
  43. Kızılersü, A.; Kreer, M.; Thomas, A.W. The Weibull distribution. Significance 2018, 15, 10–11.
  44. Mouri, H. Log-normal distribution from a process that is not multiplicative but is additive. Phys. Rev. E 2013, 88, 042124.
  45. Shampine, L.F.; Reichelt, M.W.; Kierzenka, J.A. Solving Index-1 DAEs in MATLAB and Simulink. SIAM Rev. 1999, 41, 538–552.
  46. Jiang, R. A new bathtub curve model with a finite support. Reliab. Eng. Syst. Saf. 2013, 119, 44–51.
  47. James, G.; Dyke, P. Advanced Modern Engineering Mathematics, 5th ed.; Pearson Education: Hoboken, NJ, USA, 2018; pp. 331–335. ISBN 9781292174341.
  48. Habgood, K.; Arel, I. A condensation-based application of Cramer’s rule for solving large-scale linear systems. J. Discret. Algorithms 2012, 10, 98–109.
  49. Stewart, J. Calculus, Metric Version, 8th ed.; Cengage Learning: Boston, MA, USA, 2015; pp. 533–540. ISBN 9781473742437.
Figure 1. Rate diagram for 2SBRSBF with: (a) First Case distributions; (b) Second Case distributions with full aging in standby.
Figure 2. Identification of the equivalent aging time for different aging assumptions: (a) under full aging; (b) under no aging; (c) under partial aging.
Figure 3. The truncated distributions in Example 1. The three failure distributions are shown in sections (a–c), whereas the repair distribution is shown in sections (d–f). The reliability/repair functions, the PDFs, and the failure/repair rates are given respectively in the first row (sections (a,d)), the second row (sections (b,e)), and the third row (sections (c,f)).
Figure 4. The truncated distributions in Example 2. The three failure distributions are shown in sections (a–c), whereas the repair distribution is shown in sections (d–f). The reliability/repair functions, the PDFs, and the failure/repair rates are given respectively in the first row (sections (a,d)), the second row (sections (b,e)), and the third row (sections (c,f)).
Figure 5. The truncated distributions in Example 3. The three failure distributions are shown in sections (a–c), whereas the repair distribution is shown in sections (d–f). The reliability/repair functions, the PDFs, and the failure/repair rates are given respectively in the first row (sections (a,d)), the second row (sections (b,e)), and the third row (sections (c,f)).
Figure 6. State probability functions for Example 1 (with states 1 through 4 given in sections (a–d), respectively) from the analytical, numerical, and simulation solutions.
Figure 7. Reliability functions for Example 1 from the analytical, numerical and simulation solution.
Figure 8. State probability functions for Example 2 (with states 1 through 4 given in sections (a–d), respectively) under the three aging assumptions.
Figure 9. Reliability functions for Example 2 under the three aging assumptions.
Figure 10. State probability functions for Example 3 (with states 1 through 4 given in sections (a–d), respectively) under the three aging assumptions.
Figure 11. Reliability functions for Example 3 under the three aging assumptions.
Table 1. Overview of the state-of-the-art publications in the warm-standby area.

| Reference | Arbitrary Failure Distribution | Arbitrary Repair Distribution | Switching Failure | Aging | Repair | Repair Type | Dynamic Solution |
|---|---|---|---|---|---|---|---|
| [19] | no | N/A | yes | no | no | N/A | yes |
| [20] (pp. 167–170) | yes | N/A | no | yes | no | N/A | yes |
| [21] | yes | N/A | no | yes | no | N/A | yes |
| [6] | yes | N/A | no | yes | no | N/A | yes |
| [22] | yes | N/A | no | yes | no | N/A | yes |
| [17] | yes | N/A | yes | yes | no | N/A | yes |
| [23] | no | no | no | no | yes | AGAN | no |
| [4] | no | no | yes | no | yes | AGAN | yes |
| [24] | no | yes | no | no | yes | AGAN | no |
| [25] | no | no | no | no | yes | AGAN | yes |
| [18] | no | yes | no | no | yes | AGAN | no |
| [26] | yes | yes | no | no | yes | AGAN | yes |
| Current study | yes | yes | yes | yes | yes | WTO | yes |
Table 2. Description of the original distributions in Example 1.

| Component–Event–Mode | Distribution | Parameters |
|---|---|---|
| Primary Component Failure | Exponential | λ1 = 0.0005 failures/h |
| Backup Component Failure in operation | Exponential | λ2 = 0.0008 failures/h |
| Backup Component Failure in standby | Exponential | λ3 = 0.00025 failures/h |
| Primary Component Repair | Exponential | λ4 = 0.008 repairs/h |
Table 3. Description of the original distributions in Example 2.

| Component–Event–Mode | Distribution | Parameters |
|---|---|---|
| Primary Component Failure | Rayleigh | b1 = 1600 h |
| Backup Component Failure in operation | Normal | μ2 = 1000 h and σ2 = 900 h |
| Backup Component Failure in standby | Weibull | θ3 = 4500 h and β3 = 2.2 |
| Primary Component Repair | Lognormal | tmed,4 = 90 h and s4 = 0.8 |
Table 4. Description of the original distributions in Example 3.

| Component–Event–Mode | Distribution | Parameters |
|---|---|---|
| Primary Component Failure | Rayleigh | b1 = 1600 h |
| Backup Component Failure in operation | Lognormal | tmed,2 = 537 h and s2 = 1.3 |
| Backup Component Failure in standby | Weibull | θ3 = 4500 h and β3 = 2.2 |
| Primary Component Repair | Lognormal | tmed,4 = 90 h and s4 = 0.8 |
Table 5. Four typical pseudo-realities from Example 1.

| Step | Pseudo-reality 1 | Pseudo-reality 2 | Pseudo-reality 3 | Pseudo-reality 4 |
|---|---|---|---|---|
| 1 | Time 0.0 h: Start of the simulation. The primary component operates, the backup component is ready. | Time 0.0 h: Start of the simulation. The primary component operates, the backup component is ready. | Time 0.0 h: Start of the simulation. The primary component operates, the backup component is ready. | Time 0.0 h: Start of the simulation. The primary component operates, the backup component is ready. |
| 2 | Time 1378.3 h: The primary component fails in operation. The primary component under repair, the backup component operates. | Time 1753.2 h: The primary component fails in operation. The primary component under repair, the backup component operates. | Time 2016.9 h: The primary component fails in operation. The primary component under repair, the backup component operates. | Time 2348.6 h: The primary component fails in operation. The primary component under repair, the backup component operates. |
| 3 | Time 1467.6 h: The primary component successfully repaired. The primary component operates, the backup component is ready. | Time 1821.4 h: The primary component successfully repaired. The primary component operates, the backup component is ready. | Time 2042.9 h: The primary component successfully repaired. The primary component operates, the backup component is ready. | Time 2406.5 h: The primary component successfully repaired. The primary component operates, the backup component is ready. |
| 4 | Time 2099.6 h: The primary component fails in operation. Switching failure. Type a system failure (switching failure). | Time 4321.5 h: The primary component fails in operation. The primary component under repair, the backup component operates. | Time 8168.7 h: The primary component fails in operation. The primary component under repair, the backup component operates. | Time 3057.8 h: The backup component fails in standby. The primary component operates, the backup component failed in standby. |
| 5 | | Time 4460.6 h: The primary component successfully repaired. Back-switching failure. Type b system failure (back-switching failure). | Time 8288.8 h: The backup component fails in operation. Type c system failure (backup component failure during primary repair). | Time 3712.4 h: The primary component fails in operation. Type d system failure (standby failure + primary failure). |
Table 6. Reliability characteristics of the 2SBRSBF from Example 1.

| Characteristic | Value |
|---|---|
| Count of pseudo-realities | 100,000 |
| Simulation time | 2.000 × 10^4 h |
| Mean value (Simulation) | 4.294 × 10^3 h |
| Median | 3.418 × 10^3 h |
| Interquartile range | 4.187 × 10^3 h |
| B10 life | 7.922 × 10^2 h |
| B1 life | 1.174 × 10^2 h |
| Mean value (Analytical) | 4.282 × 10^3 h |
| Mean value (Numerical) | 4.280 × 10^3 h |
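For the constant-rate Example 1, quantile-type characteristics such as the median, the B10 life, and the B1 life can also be approximated from the closed-form reliability function of Appendix A by solving $R_{sys}(t) = 1 - q$ for the failed fraction $q$. The sketch below does this by bisection, reading B10 and B1 as the times by which 10% and 1% of systems have failed. The rates are taken from Table 2; `p_f = 0.12`, `p_r = 0.03`, the expressions for `K` and `C`, and the search bracket are assumptions for illustration, and the results are meant only for comparison with the simulated values above.

```python
import math

lam1, lam2, lam3, lam4 = 0.0005, 0.0008, 0.00025, 0.008  # Table 2, per hour
p_f, p_r = 0.12, 0.03                                     # ASSUMPTION: illustrative values

# ASSUMPTION: K and C reconstructed from the rate diagram.
K = 0.5 * (lam1 + lam2 + lam3 + lam4)
C = (lam1 + lam3) * (lam2 + lam4) - (1 - p_f) * (1 - p_r) * lam1 * lam4
root = math.sqrt(K * K - C)
s1, s2 = -K + root, -K - root

a = lam2 + lam4
sum_A = ((s1 + a) + (1 - p_f) * lam1 + lam3 * (s1 + a) / (lam1 + s1)) / (s1 - s2)
sum_B = ((s2 + a) + (1 - p_f) * lam1 + lam3 * (s2 + a) / (lam1 + s2)) / (s2 - s1)
C3 = lam3 * (-lam1 + a) / ((-lam1 - s1) * (-lam1 - s2))


def reliability(t):
    """Closed-form system reliability R_sys(t) for t >= 0 (hours)."""
    return sum_A * math.exp(s1 * t) + sum_B * math.exp(s2 * t) + C3 * math.exp(-lam1 * t)


def life_at(failed_fraction, t_hi=1.0e5):
    """Time by which the given fraction of systems has failed (bisection on R_sys)."""
    target = 1.0 - failed_fraction
    lo, hi = 0.0, t_hi
    for _ in range(200):
        mid = 0.5 * (lo + hi)
        if reliability(mid) > target:
            lo = mid      # still too reliable: the quantile lies later
        else:
            hi = mid
    return 0.5 * (lo + hi)


b1_life, b10_life, median = life_at(0.01), life_at(0.10), life_at(0.50)
print(f"B1 = {b1_life:.1f} h, B10 = {b10_life:.1f} h, median = {median:.1f} h")
```

Bisection is sufficient here because $R_{sys}(t)$ decreases monotonically from 1 towards 0; for time-dependent failure rates no such closed form is available and the quantiles have to come from the simulated lifetimes.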
Table 7. Chances for events of interest in % for Example 1.

| Event | Chance | Description |
|---|---|---|
| Unconditional chance for State 1 to happen | 100.00% | The primary component operates, the backup component is ready |
| Unconditional chance for State 2 to happen | 58.66% | The primary component under repair, the backup component operates |
| Unconditional chance for State 3 to happen | 68.98% | The primary component operates, the backup component failed in standby |
| Unconditional chance for State 4 to happen | 99.76% | System failure |
| Conditional chance for type a failure to happen | 16.68% | Switching failure |
| Conditional chance for type b failure to happen | 3.25% | Back-switching failure |
| Conditional chance for type c failure to happen | 11.07% | Backup component failure during primary repair |
| Conditional chance for type d failure to happen | 69.00% | Standby failure + primary failure |
Table 8. Reliability characteristics of the 2SBRSBF from Example 2 under the three aging assumptions.

| Characteristic | Full Aging | No Aging | Partial Aging |
|---|---|---|---|
| Count of pseudo-realities | 100,000 | 100,000 | 100,000 |
| Simulation time | 8.000 × 10^3 h | 8.000 × 10^3 h | 8.000 × 10^3 h |
| Mean value (Simulation) | 2.837 × 10^3 h | 3.457 × 10^3 h | 3.242 × 10^3 h |
| Median | 2.783 × 10^3 h | 3.364 × 10^3 h | 3.199 × 10^3 h |
| Interquartile range | 1.531 × 10^3 h | 2.051 × 10^3 h | 1.785 × 10^3 h |
| B10 life | 1.419 × 10^3 h | 1.596 × 10^3 h | 1.581 × 10^3 h |
| B1 life | 5.536 × 10^2 h | 5.735 × 10^2 h | 5.818 × 10^2 h |
| Mean value (Analytical) | NA | NA | NA |
| Mean value (Numerical) | 2.837 × 10^3 h | NA | NA |
Table 9. Chances in % for events of interest for Example 2 under the three aging assumptions.

| Event | Full Aging | No Aging | Partial Aging | Description |
|---|---|---|---|---|
| Unconditional chance for State 1 to happen | 100.00% | 100.00% | 100.00% | The primary component operates, the backup component is ready |
| Unconditional chance for State 2 to happen | 71.76% | 71.78% | 72.00% | The primary component under repair, the backup component operates |
| Unconditional chance for State 3 to happen | 30.43% | 40.78% | 37.13% | The primary component operates, the backup component failed in standby |
| Unconditional chance for State 4 to happen | 100.00% | 99.75% | 99.97% | System failure |
| Conditional chance for type a failure to happen | 15.56% | 20.11% | 18.19% | Switching failure |
| Conditional chance for type b failure to happen | 1.85% | 3.30% | 2.70% | Back-switching failure |
| Conditional chance for type c failure to happen | 52.16% | 35.75% | 41.99% | Backup component failure during primary repair |
| Conditional chance for type d failure to happen | 30.43% | 40.85% | 37.12% | Standby failure + primary failure |
Table 10. Reliability characteristics of the 2SBRSBF from Example 3 under the three aging assumptions.

| Characteristic | Full Aging | No Aging | Partial Aging |
|---|---|---|---|
| Count of pseudo-realities | 100,000 | 100,000 | 100,000 |
| Simulation time | 1.200 × 10^4 h | 1.200 × 10^4 h | 1.200 × 10^4 h |
| Mean value (Simulation) | 3.652 × 10^3 h | 3.139 × 10^3 h | 3.197 × 10^3 h |
| Median | 3.317 × 10^3 h | 2.957 × 10^3 h | 2.957 × 10^3 h |
| Interquartile range | 2.400 × 10^3 h | 1.924 × 10^3 h | 2.015 × 10^3 h |
| B10 life | 1.432 × 10^3 h | 1.364 × 10^3 h | 1.336 × 10^3 h |
| B1 life | 4.870 × 10^2 h | 5.031 × 10^2 h | 4.999 × 10^2 h |
| Mean value (Analytical) | NA | NA | NA |
| Mean value (Numerical) | 3.653 × 10^3 h | NA | NA |
Table 11. Chances in % for events of interest for Example 3 under the three aging assumptions.

| Event | Full Aging | No Aging | Partial Aging | Description |
|---|---|---|---|---|
| Unconditional chance for State 1 to happen | 100.00% | 100.00% | 100.00% | The primary component operates, the backup component is ready |
| Unconditional chance for State 2 to happen | 71.75% | 71.83% | 71.80% | The primary component under repair, the backup component operates |
| Unconditional chance for State 3 to happen | 44.31% | 35.71% | 36.55% | The primary component operates, the backup component failed in standby |
| Unconditional chance for State 4 to happen | 99.38% | 99.89% | 99.80% | System failure |
| Conditional chance for type a failure to happen | 21.53% | 17.73% | 18.31% | Switching failure |
| Conditional chance for type b failure to happen | 3.88% | 2.48% | 2.68% | Back-switching failure |
| Conditional chance for type c failure to happen | 30.04% | 44.06% | 42.39% | Backup component failure during primary repair |
| Conditional chance for type d failure to happen | 44.56% | 35.74% | 36.62% | Standby failure + primary failure |
Publisher’s Note: MDPI stays neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Share and Cite

MDPI and ACS Style

Tenekedjiev, K.; Cooley, S.; Mednikarov, B.; Fan, G.; Nikolova, N. Reliability Simulation of Two Component Warm-Standby System with Repair, Switching, and Back-Switching Failures under Three Aging Assumptions. Mathematics 2021, 9, 2547. https://doi.org/10.3390/math9202547
