Joint Optimization of Condition-Based Maintenance and Performance Control for Linear Multi-State Consecutively Connected Systems

Wang, Jun; Wang, Yuyang; Fu, Yuqiang

doi:10.3390/math11122724

by Jun Wang ¹, Yuyang Wang ¹,* and Yuqiang Fu ²

¹ International Business School, Beijing Foreign Studies University, Beijing 100089, China

² School of Mathematics and Physics, University of Science and Technology Beijing, Beijing 100083, China

* Author to whom correspondence should be addressed.

Mathematics 2023, 11(12), 2724; https://doi.org/10.3390/math11122724

Submission received: 25 May 2023 / Revised: 14 June 2023 / Accepted: 14 June 2023 / Published: 15 June 2023

(This article belongs to the Special Issue System Reliability and Quality Management in Industrial Engineering)

Abstract

Industrial systems such as signal relay stations and oil pipeline systems can be modeled as linear multi-state consecutively connected systems, which comprise sequentially ordered elements and fail if the first and the final elements are not connected. The performance level of each element is controllable, which determines how many elements an element can connect and affects its degradation rate. Accumulated degradation can cause element failure, which may lead to costly system failure. This paper aims to minimize long-term maintenance-related costs, including system failure costs. We provide optimal maintenance planning and performance control for every system degradation state through Markov decision process modeling and a dynamic programming algorithm. Load-sharing, restricted maintenance capacity, maintenance setup costs, and the structural characteristics of the system are considered in the model, all of which influence the optimal maintenance and performance control policy. Regarding degradation management, reducing the difference in degradation levels between elements, e.g., replacing more-degraded elements first, can be cost-effective. However, increasing the difference in degradation by maintenance or performance control can also lower maintenance-related costs in specific situations, which is discussed in numerical experiments. We also illustrate structural insights regarding the proposed model, including sensitivity analyses of maintenance capacity, setup costs, and the difference between preventive and corrective replacement costs.

Keywords: condition-based maintenance; condition-based performance control; linear multi-state consecutively connected systems; load-sharing; Markov decision process

MSC: 90B25; 90C39; 90C40

1. Introduction

Many multi-element systems employ redundancy to provide high reliability, and one of the commonly used redundant systems is the linear multi-state consecutively connected system (LMCCS) [1,2]. An LMCCS contains multiple consecutively connected elements, each with several performance levels to connect itself and the following elements. An element cannot provide connections to any other elements when it fails, and the system fails when the first and last elements are not connected. The LMCCS generalizes linear consecutive-k-out-of-n:F systems [1,3] and is widely applied in engineering projects, such as petroleum pipeline systems, signal relay stations, street illumination systems, and logistics systems. Elements in such systems degrade over time, leading to costly system failure. Therefore, this paper focuses on degradation management for LMCCSs to lower maintenance-related costs, including the system failure cost.

Take an oil pipeline system as an example of the LMCCS, where a sequence of pump stations transports crude oil from an oilfield to a refinery. The system is designed to be redundant, and the output power (performance level) of pump stations is controllable. Hence, if a station fails, operators can improve the performance of stations before the failed station to guarantee steady oil transportation. However, a higher output power leads to a higher degradation rate. At the same time, the operator can actively lower the performance of a highly degraded pump station to reduce its failure risk. To summarize, the system redundancy and the controllability of performance provide opportunities to decrease operation and maintenance costs through performance control. Therefore, we consider condition-based performance control (CBP), i.e., controlling the performance level of each element based on the system degradation state. In addition, timely maintenance brings an element to a better or brand-new state, preventing system failure or decreasing system downtime. Thus, we adopt condition-based maintenance (CBM), i.e., determining maintenance for the elements based on the system degradation state.

The CBP decision affects the degradation of the elements and, in turn, the CBM decision. Meanwhile, CBM brings failed elements back to the functional state, which influences the CBP decision because only the performance level of a functional element is controllable. That is, CBP and CBM both affect the degradation of the elements and therefore affect each other, so they should be optimized simultaneously. We also consider the impact of maintenance capacity on CBP and CBM. Although a higher capacity lowers the system failure risk, it needs more resources, e.g., a large maintenance team and a high spare parts inventory. Therefore, considering the restriction of the maintenance capacity, we focus on the joint optimization of condition-based maintenance and performance control (CBMP) for LMCCSs. Specifically, we assume that all the elements in an LMCCS are functionally identical and degrade continuously over time according to a gamma process, and that the degradation rate of an element increases when its performance level increases. The performance level of an element indicates how many subsequent nodes it can connect and is controllable if the element is not failed. The LMCCS is inspected periodically, and at each inspection the performance control and maintenance decisions are made to minimize the related total cost. Finally, the CBMP problem is modeled as a Markov decision process (MDP) and solved by a dynamic programming algorithm, which provides the optimal CBMP policy, i.e., the optimal performance control and maintenance actions for each system degradation state.

The performance control in CBMP optimization also covers the load-sharing in redundant systems. When an element fails, the system with redundancy keeps operating, but the remaining elements should raise their performance to keep the same system performance. The elements taking an additional load, i.e., in high performance, degrade faster. In LMCCSs, including the linear consecutive-k-out-of-n:F system, the load increment caused by element failure can only be shared with elements before the failed one. In addition, in LMCCSs, such as oil pipeline systems and signal relay networks, operators can choose which element(s) to take an additional load by raising performance. Therefore, the degradation processes of elements are dependent, and operators can manipulate such degradation dependence. The CBMP optimization can reach optimal load-sharing, i.e., effectively assign the load (control the performance) among functional elements.

Based on the numerical studies in this paper, we find that the restricted maintenance capacity, the load-sharing, and the structural characteristics of LMCCSs can lead to a reverse balance of degradation levels between elements. The reverse balance in degradation management means increasing the difference in degradation levels between elements. The existing literature has studied the balance of degradation, which decreases this difference. In contrast, our numerical results show that the reverse balance can also be cost-effective in specific situations, where more-degraded elements take over the load from less-degraded ones, or highly degraded elements rather than failed ones are prioritized for replacement. Meanwhile, the numerical comparison shows that the CBMP policy proposed in this paper outperforms a benchmark policy that combines predetermined performance control with CBM. In addition, sensitivity analyses are performed on the maintenance capacity, the setup cost, and the difference between preventive and corrective replacement costs to give more management insights into LMCCSs.

Our contributions are summarized in the following four aspects: (1) CBMP, an efficient management technique, is proposed for LMCCSs. As far as we know, few studies have addressed CBM or CBMP for LMCCSs. (2) The load-sharing in LMCCSs is investigated. Our research methodology and findings also apply to linear consecutive-k-out-of-n:F systems. (3) We provide new insights into CBMP. To our knowledge, little existing literature considers the substantial influence of the maintenance capacity on CBMP. We study the CBMP policy under the combined effect of the restricted maintenance capacity, the load-sharing, the maintenance setup cost, and the structural characteristics of LMCCSs. (4) We examine the value and rationale of the reverse balance of degradation levels between elements, whereas balancing degradation is the usual practice in degradation management. The optimal CBMP policy takes advantage of both the balance and the reverse balance.

The remainder of the paper is structured as follows. Section 2 discusses the related literature. Section 3 elaborates on the LMCCS and the CBMP policy, where we also demonstrate assumptions concerning the maintenance capacity, maintenance costs, and the system failure cost. Section 4 presents the formulation of the MDP model and the dynamic programming algorithm. Section 5 provides numerical experiments of a five-element LMCCS. The main experiment, comparison experiment, and sensitivity analysis are performed to acquire insights into the optimal CBMP policy, the balance, and the reverse balance. Finally, Section 6 concludes the paper and offers future research directions.

2. Literature Review

The literature concerning the structure of LMCCSs and reliability optimization for LMCCSs is reviewed in Section 2.1. Studies concerning degradation management are investigated in Section 2.2, where three methods of degradation management, four types of inter-element dependence, and the balance and reverse balance of degradation levels are discussed. Finally, Section 2.3 examines related methodologies.

2.1. Literature Regarding Linear Multi-State Consecutively Connected Systems

The LMCCS was first proposed by Hwang and Yao [1] to generalize linear consecutive-k-out-of-n:F systems, which fail if and only if at least $k$, $k \in \mathbb{N}^{+}$, consecutive elements fail [3]. An LMCCS consists of successively ordered nodes where elements provide connections between nodes according to each element’s performance level. In traditional reliability research on LMCCSs [2,4,5], the performance level of an element is a random integer from 0 to $k$, and the operator cannot manipulate the level. In contrast, in our study, the operator can adjust the performance level of a functional element to any integer from 0 to $k$. By the term functional, we mean the element is not failed, but it does not denote a positive performance level. Therefore, zero performance does not equal element failure.

The maintenance and allocation policy for LMCCSs has been discussed by Peng et al. [4]. They adopted an age-based preventive replacement and minimal repair policy. To our knowledge, little literature has explored CBM for LMCCS. Endharta and Yun [6] explored CBM for linear consecutive-k-out-of-n:F systems. However, their study is based on the number of failed elements, not the system degradation state. In their research, preventive maintenance is scheduled if the system state is worse than a threshold. Once the maintenance is scheduled, it is performed after an interval, the length of which is their decision variable, while the threshold is predetermined. We, by contrast, do not assume a predetermined maintenance policy such as the abovementioned threshold.

Concerning load-sharing in redundant systems, Olde Keizer et al. [7] have researched CBM for parallel systems with load-sharing. They assume the degradation rate of each element depends on the number of functional elements. In an LMCCS, by contrast, the extra load can only be shared with several elements before the failed one. It is the same in a linear consecutive-k-out-of-n:F system. Such constraints on load-sharing in LMCCSs cannot be observed in some other redundant systems, e.g., specific k-out-of-n systems [8,9,10] and multi-unit production systems [11]. A higher performance level indicates a higher load and a faster degradation rate. It is possible to predetermine a load-sharing rule: if an element fails, its closest preceding functional element takes the load. However, our numerical results show that CBP outperforms the predetermined load-sharing. In other words, giving different load-sharing arrangements according to different system degradation states is more cost-effective. Uit het Broek et al. [12] also researched the condition-based load-sharing policy, but they focused on two-element systems. We studied the condition-based load-sharing (performance control) policy under the constraints of LMCCSs.

2.2. Literature Regarding Degradation Management

Then, we review three common ways of degradation management in the order of performance control, maintenance, and reassignment. Geurtsen et al. [13] reviewed maintenance and production scheduling, where the operator regulates the number of machines working each period [14,15,16]. This type of literature researches performance control with two performance levels, working and being idle. Some industrial equipment has only two performance levels, but there is also equipment with more levels, which CBMP fits. In addition, An et al. [17] acknowledged the significance of performance control. In their research, when a new machine is inserted into the shop floor, its performance level is selected, but afterward, the machine can only switch between working and idle. In contrast, we can adjust performance with more levels at every decision epoch, taking more advantage of the performance control.

Regarding controlling more than two performance levels, Uit het Broek et al. [18] studied condition-based production for single-element systems. Condition-based production is one type of CBP because, for manufacturing systems, the production rate indicates the performance level. They control the degradation process by condition-based production to balance output and failure risk. Uit het Broek et al. [19] further investigated the joint optimization of CBM and condition-based production for single-element systems. CBMP policies for single-element systems are also studied by Zhao et al. [20].

However, the influence of inter-element dependence [21] on CBMP cannot be investigated through single-element systems. Olde Keizer et al. [21] identify four types of dependence: structural, stochastic, economic, and resource dependence. The unique structural characteristics of LMCCSs are one form of structural dependence. The load-sharing, the fixed maintenance setup cost, and the limited maintenance capacity are related to stochastic, economic, and resource dependence, respectively. Uit het Broek et al. [12] researched the joint optimization of CBM and condition-based load-sharing for economically dependent two-element systems. Our contributions to CBMP include introducing resource dependence and the LMCCS structure, and then discussing how the four types of dependence affect the optimal CBMP policy.

Concerning resource dependence, scholars have studied restrictions on human resources [22], spare parts inventory [8,11,23,24,25], and transportation [26]. However, to our knowledge, few studies on CBMP consider resource dependence. We assume that maintenance activities are carried out in the form of replacement. We adopt the term maintenance capacity to denote the number of elements that can be replaced simultaneously. Although a high capacity reduces system failure risk, it increases costs associated with, for example, high spare inventory, numerous maintenance workers, and considerable transportation fees. In our study, the maintenance capacity is restricted and predetermined.

Age-based and time-based maintenance make decisions according to system age and elapsed time, respectively [9,27]. One of the strengths of CBM is that it makes decisions based on the actual system state obtained by, for instance, sensors and inspections [28,29,30]. For extensive reviews on CBM, we refer to De Jonge and Scarf [31] and Alaswad and Xiang [32]. Regarding specific CBM policies, the control limit, a threshold for maintenance initiation, is widely applied in single-element systems [20]. Once the degradation exceeds the control limit, maintenance is performed. The control limit policy is also effective for some multi-element systems [33]. However, in our study, the maintenance priority of the degraded but not failed elements can be higher than that of the failed ones because of factors including a restricted maintenance capacity, a high element failure cost, and the redundancy of LMCCSs. The reverse balance indicates that the control limit policy does not apply to our model.

Then, we review the literature on reassignment [34,35,36,37], another way of managing degradation. Elements are subject to various environmental conditions and loads that affect their degradation rates, so the degradation of elements can be managed by reassigning them to suitable positions. However, the reassignment, for example, exchanging the positions of two elements, incurs costs concerning transportation and installation [38]. Therefore, the reassignment does not fit some industrial systems, and we only adopt maintenance and performance control for the LMCCS.

The balance of degradation between elements has been well studied in the literature on reassignment. One of its typical policies is to swap positions so that more-degraded elements work with low environmental stress while less-degraded ones work in harsh environments [34,35,36,37]. The balance can cluster maintenance activities for economies of scale and avoid premature failure of highly degraded elements. The application of maintenance thresholds, such as the control limit, can also be seen as a balance of degradation. According to our numerical results, we take advantage of the balance and adopt the reverse balance to achieve more cost savings. Uit het Broek et al. [12] pointed out that the reverse balance is effective in economically dependent two-element systems with load-sharing. We find that the effectiveness of the balance and the reverse balance is associated with factors including the element failure cost and four types of dependence, structural, stochastic, resource, and economic dependence.

2.3. Review of Methodology

Finally, we review the methodology. The gamma process, the inverse Gaussian process, and the Wiener process are widely applied to model continuous degradation [39]. We adopt the gamma process, which is appropriate for describing degradation in the form of wear and cumulative damage [40]. To lower maintenance-related costs, the numerical analysis based on renewal theory is a significant methodology for systems under continuous degradation [31,32]. However, for the CBMP optimization of LMCCSs, identifying and evaluating renewal cycles is difficult. We adopt a commonly used alternative to discretize the continuous degradation level of each element into discrete degradation states [12,23,25,31,32]. Therefore, we can model the problem by the MDP and solve it through a dynamic programming algorithm.

3. Problem Description

Our base system is an LMCCS consisting of $N + 1$, $N \in \mathbb{N}^{+}$, sequentially ordered nodes $B_{i}$, $i = 1, \dots, N + 1$, as shown in Figure 1. Each node, except the last one, has an element to connect the current node and subsequent nodes. All the $N$ elements are functionally identical and degrade continuously over time. The degradation rate of an element increases when its performance level increases. The performance level of an element indicates how many subsequent nodes it can connect and is controllable if the element is not failed. Let $u_{i}$, $u_{i} \in \{0, 1, \dots, k\}$, $k \in \mathbb{N}^{+}$, denote the performance level of the $i$th element, $i = 1, 2, \dots, N$, where level $0$ means the element provides no connection and $k$ is the maximum level. As shown in Figure 1, the $i$th element at node $B_{i}$ at performance level $u_{i}$ can provide connections from $B_{i}$ to $B_{\min\{i + 1, N + 1\}}, \dots, B_{\min\{i + u_{i}, N + 1\}}$. If the first node $B_{1}$ and the last one $B_{N + 1}$ are not connected, the LMCCS fails.
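
The connection rule above can be illustrated with a short sketch. The function name `is_connected` and the list representation of the performance levels are assumptions made for this illustration only; the check simply walks along the nodes and tracks the farthest node that is still linked to $B_{1}$.

```python
def is_connected(u):
    """Return True if node B_1 is connected to node B_{N+1}.

    u[i-1] is the performance level of the element at node B_i (i = 1..N);
    an element at level u_i links B_i to B_{i+1}, ..., B_{min(i+u_i, N+1)}.
    """
    n = len(u)
    reach = 1  # farthest node index known to be connected to B_1
    for i in range(1, n + 1):
        if i > reach:
            # Node B_i cannot be reached, so nothing beyond it can be either.
            return False
        reach = max(reach, min(i + u[i - 1], n + 1))
        if reach == n + 1:
            return True
    return reach == n + 1


# Five elements, k = 2: the second and fourth elements are switched off,
# but their predecessors run at level 2 and bridge the gaps.
print(is_connected([2, 0, 2, 0, 1]))
print(is_connected([1, 0, 1, 1, 1]))
```

In the first call the elements at $B_{1}$ and $B_{3}$ each skip one node, so the system works; in the second call nothing reaches $B_{3}$, so the system fails.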

In the remainder of the section, we first introduce the discretization of the continuous degradation level and then the degradation process of the elements in Section 3.1. We elaborate on the CBMP policy and related costs for the LMCCS in Section 3.2.

3.1. Degradation of the Elements

The degradation level of an element ranges from $0$ to infinity, where level $0$ indicates that the element is brand-new, and an element fails when its degradation level reaches or exceeds a predetermined failure threshold $L$. To make the CBMP policy optimization tractable using the MDP, we discretize the continuous degradation level into $(D + 1)$ discrete degradation states $\{0, 1, \dots, D\}$, $D \in \mathbb{N}^{+}$. Let $l = L / D$. The degradation level in $[(x - 0.5) l, (x + 0.5) l)$ corresponds to the discrete degradation state $x$, $x = 1, 2, \dots, D - 1$, the level in $[0, 0.5 l)$ corresponds to the brand-new state $x = 0$, and the level in $[(D - 0.5) l, +\infty)$ corresponds to the failure state $x = D$.
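
As a minimal sketch of the discretization rule above, the mapping from a continuous degradation level to a discrete state can be written as a rounding operation capped at $D$. The function name `discretize` and its argument names are illustrative assumptions, not part of the model.

```python
import math


def discretize(level, L, D):
    """Map a continuous degradation level to a discrete state in {0, ..., D}."""
    if level < 0:
        raise ValueError("degradation level must be non-negative")
    l = L / D  # width of one discrete state
    # floor(level / l + 0.5) selects x with (x - 0.5) l <= level < (x + 0.5) l;
    # everything at or above (D - 0.5) l is mapped to the failure state D.
    return min(D, math.floor(level / l + 0.5))


# With L = 10 and D = 5 the state width is l = 2.
print([discretize(z, 10.0, 5) for z in (0.9, 1.0, 2.9, 3.0, 8.9, 9.0, 100.0)])
```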

Let $x_{i}$ denote the degradation state of the $i$th element, $i = 1, 2, \dots, N$, $x_{i} \in \{0, 1, 2, \dots, D\}$. Then, the system degradation state is $x = (x_{1}, x_{2}, \dots, x_{N})$. Starting from $x$, let $x^{d} = (x_{1}^{d}, x_{2}^{d}, \dots, x_{N}^{d})$ denote the system state after a time unit, $x_{i}^{d} \in \{0, 1, 2, \dots, D\}$ for $i = 1, 2, \dots, N$. The probability of degrading from state $x_{i}$ to $x_{i}^{d}$ of the $i$th element under performance $u_{i}$ in a time unit is denoted as $P_{u_{i}}(x_{i}, x_{i}^{d})$. The system performance level is denoted as $u = (u_{1}, u_{2}, \dots, u_{N})$.

The degradation of elements is subject to load-sharing, and there are three considerations for the condition-based load-sharing decision. First, the load of an element in an LMCCS can only be shared by several elements before it, and only a functional element can take the load. Second, the load shift is not merely triggered by element failure, e.g., operators can switch off a functional element based on the system degradation state. Last, when the maximal performance level $k$ is greater than 2, multiple elements can share the load of a not-functioning element, so it has to be decided which element will bear the load. To summarize, the condition-based load-sharing should consider which element(s) do not take the load, which element(s) bear the load, and the structural characteristics of LMCCSs. To model the load-sharing, we convert it into a performance control problem. We assume that, at a given performance level, the degradation process of an element is independent of the degradation processes of other elements. Therefore, the probability of degrading from system state $x$ to $x^{d}$ under system performance $u$ in a time unit is $P_{u}(x, x^{d}) = \prod_{i = 1}^{N} P_{u_{i}}(x_{i}, x_{i}^{d})$.

To model the degradation of each element, we employ the gamma process, which is appropriate for depicting continuous degradation in cumulative damage such as wear, erosion, and fatigue [39,40]. We model the degradation level of the $i$th element as a gamma process $G_{i}$ with the shape parameter $\gamma$ and the scale parameter $\lambda(u_{i})$. The increment of the degradation level in a time unit, denoted by $\Delta G_{i}$, has the probability density function

$$f_{\Delta G_{i}}(z; \gamma, \lambda(u_{i})) = \frac{\lambda(u_{i})^{\gamma} z^{\gamma - 1} \exp(-\lambda(u_{i}) z)}{\Gamma(\gamma)} \tag{1}$$

where the gamma function $\Gamma(a) = \int_{z = 0}^{\infty} z^{a - 1} e^{-z} \, dz$. Then, the expected degradation increment per time unit, i.e., the degradation rate, of the $i$th element is $\gamma \lambda(u_{i})$. Since the degradation rate increases with the performance level, $\lambda(u_{i})$ is an increasing function of $u_{i}$. The gamma process is a monotonic degradation process, where an element can only remain in its current state or degrade to a worse state.

The probability that an element degrades from state $x_{i}$ to $x_{i}^{d}$ under performance $u_{i}$ in a time unit for $x_{i} = 0, 1, \dots, D$ is

$$P_{u_{i}}(x_{i}, x_{i}^{d}) = \begin{cases} 0, & x_{i}^{d} = 0, 1, \dots, x_{i} - 1, \\ \Pr\{(x_{i}^{d} - x_{i} - 0.5) l \leq \Delta G_{i} < (x_{i}^{d} - x_{i} + 0.5) l\}, & x_{i}^{d} = x_{i}, x_{i} + 1, \dots, D - 1, \\ \Pr\{(x_{i}^{d} - x_{i} - 0.5) l \leq \Delta G_{i}\}, & x_{i}^{d} = D. \end{cases} \tag{2}$$
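
The following sketch shows one way Equation (2) could be evaluated numerically. It is an illustration under stated assumptions rather than the authors' implementation: $\lambda(u)$ is treated as a scale parameter, so that the mean increment is $\gamma \lambda(u)$ as stated in the text; the function `scale_of` passed in by the caller is a hypothetical choice of $\lambda(u)$; and the gamma CDF is computed with a standard power series for the regularized incomplete gamma function, which is adequate only for moderate arguments.

```python
import math


def gamma_cdf(z, shape, scale):
    """Gamma CDF via the series for the regularized lower incomplete gamma.

    Intended for moderate z / scale (up to a few hundred); not a
    general-purpose routine.
    """
    if z <= 0:
        return 0.0
    x = z / scale
    term = 1.0 / shape
    total = term
    n = 1
    while term > 1e-16 * total:
        term *= x / (shape + n)
        total += term
        n += 1
    return min(1.0, total * math.exp(shape * math.log(x) - x - math.lgamma(shape)))


def transition_row(x, u, D, L, shape, scale_of):
    """Return [P_u(x, 0), ..., P_u(x, D)] following Equation (2)."""
    l = L / D
    scale = scale_of(u)  # hypothetical lambda(u), assumed increasing in u
    row = [0.0] * (D + 1)  # states below x keep probability 0
    for xd in range(x, D):
        lo = (xd - x - 0.5) * l
        hi = (xd - x + 0.5) * l
        row[xd] = gamma_cdf(hi, shape, scale) - gamma_cdf(lo, shape, scale)
    # All remaining probability mass falls into the failure state D.
    row[D] = 1.0 - gamma_cdf((D - x - 0.5) * l, shape, scale)
    return row


# Illustrative parameters only: L = 10, D = 5, shape 1, lambda(u) = 0.5 + 0.5 u.
row = transition_row(2, 1, 5, 10.0, 1.0, lambda u: 0.5 + 0.5 * u)
print(row, sum(row))
```

Because the intervals in Equation (2) tile the half-line starting at $-0.5 l$, each row sums to one, and a failed element ($x_{i} = D$) stays failed with probability one.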

3.2. CBMP Policy and Related Costs

Inspection, condition-based element replacement, and condition-based performance control are successively executed at the start of every time unit. The time required for these activities is assumed to be negligible. The inspection obtains the system degradation state $x = (x_{1}, x_{2}, \dots, x_{N})$, incurring cost $c_{\mathrm{ins}}$. Specialized equipment and maintenance staff need to be prepared if the maintenance is scheduled, incurring setup cost $c_{\mathrm{set}}$. We assume that the replacement restores an element to its brand-new state. The residual value of a failed element is smaller than that of a degraded but not failed one. Therefore, the cost of replacing a degraded but not failed element $c_{\mathrm{pm}}$ is less than that of a failed element $c_{\mathrm{cm}}$, i.e., $c_{\mathrm{pm}} < c_{\mathrm{cm}}$. The difference between the corrective and preventive replacement costs $(c_{\mathrm{cm}} - c_{\mathrm{pm}})$ can be seen as an element failure cost, not incurred upon element failure but reflected when performing the replacement. Maintenance capacity $M_{\mathrm{capa}}$ refers to the maximum number of elements that can be replaced simultaneously at the beginning of each unit of time.

Performance control is performed after the replacement. The performance level of an element can be set as an integer from $0$ to $k$ if the element is functional, i.e., in degradation states from $0$ to $D - 1$. For a failed element, i.e., the one in state $D$, its performance level is not controllable and can only be set as $0$. The system failure cost $c_{\mathrm{fail}}$ is incurred if the system performance level $u = (u_{1}, u_{2}, \dots, u_{N})$ cannot enable the first node $B_{1}$ and the last node $B_{N + 1}$ to be connected. The degradation rates of the elements in the unit of time are determined by their performance levels.

This paper aims to find an optimal CBMP policy for LMCCSs, i.e., to determine which elements are to be replaced and how to control performance levels based on the system degradation state. The CBMP decision is made at the beginning of every unit of time to minimize the long-term total cost, including the inspection, setup, replacement, and system failure costs.

4. Markov Decision Process Formulation

The MDP model includes a set of possible states $S$. Every possible state $x \in S$ has its own set of feasible actions $A(x)$. The model also contains the transition probabilities that the system degrades from state $x$ to state $x^{d}$ if an action in $A(x)$ is chosen. When an LMCCS is in state $x$ and under an action in $A(x)$, a corresponding cost is incurred. The value function $v(x)$ reflects the effect of actions on long-term maintenance-related costs. In addition, we present the dynamic programming algorithm, specifically the policy iteration algorithm, to solve the model, along with its pseudocode.

4.1. State Space, Action Space, and Transition Probabilities

State space. The state of the LMCCS is measured by the degradation state $x = (x_{1}, x_{2}, \dots, x_{N})$. The state space is $S = \{(x_{1}, x_{2}, \dots, x_{N})\}$, where $x_{i} \in \{0, 1, \dots, D\}$ for $i = 1, 2, \dots, N$.

Action space. At the beginning of each time unit, two sequential actions are executed. First is the replacement decision, denoted as $\delta$. Then, the performance control sets the system performance to $u$. As a result, the action space is determined as follows: $A(x) = \{\delta, u\}$, $\delta = (\delta_{1}, \delta_{2}, \dots, \delta_{N})$, $u = (u_{1}, u_{2}, \dots, u_{N})$. The maintenance capacity $M_{\mathrm{capa}}$ limits the number of replacements. For $i = 1, 2, \dots, N$:

$$\delta_{i} = \begin{cases} 1, & \text{if the } i\text{th element is replaced}, \\ 0, & \text{otherwise}, \end{cases} \qquad \sum_{i = 1}^{N} \delta_{i} \leq M_{\mathrm{capa}} \tag{3}$$

An LMCCS’s degradation state after replacement decision $\delta$ is denoted as $x^{m}(x, \delta) = (x_{1}^{m}(x_{1}, \delta_{1}), x_{2}^{m}(x_{2}, \delta_{2}), \dots, x_{N}^{m}(x_{N}, \delta_{N}))$. The replacement is assumed to restore an element to its brand-new state. Thus,

$$x_{i}^{m}(x_{i}, \delta_{i}) = \begin{cases} x_{i}, & \text{if } \delta_{i} = 0, \\ 0, & \text{if } \delta_{i} = 1. \end{cases} \tag{4}$$

If, after replacement, the $i$th element is still in the failure state $D$, its performance level is not controllable and is fixed at $0$. Otherwise, its performance level can be set to any integer from $0$ to $k$. Hence, for $i = 1, 2, \dots, N$:

$$\begin{cases} u_{i} = 0, & \text{if } x_{i}^{m}(x_{i}, \delta_{i}) = D, \\ u_{i} \in \{0, 1, \dots, k\}, & \text{otherwise}. \end{cases} \tag{5}$$
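
To make the constraints in Equations (3)–(5) concrete, the sketch below enumerates the feasible action set for a given state. The generator name `feasible_actions` and the tuple encoding of $\delta$ and $u$ are assumptions introduced for illustration.

```python
from itertools import combinations, product


def feasible_actions(x, D, k, m_capa):
    """Yield (delta, u) pairs satisfying Equations (3)-(5) for state x."""
    n = len(x)
    # Equation (3): replace at most m_capa elements.
    for r in range(min(m_capa, n) + 1):
        for replaced in combinations(range(n), r):
            delta = tuple(1 if i in replaced else 0 for i in range(n))
            # Equation (4): a replaced element returns to state 0.
            x_m = tuple(0 if d else xi for xi, d in zip(x, delta))
            # Equation (5): a still-failed element is pinned to level 0,
            # any other element may run at a level from 0 to k.
            choices = [(0,) if xm == D else tuple(range(k + 1)) for xm in x_m]
            for u in product(*choices):
                yield delta, u


# Two elements, D = 2, k = 1, capacity 1; the second element has failed.
for delta, u in feasible_actions((0, 2), D=2, k=1, m_capa=1):
    print(delta, u)
```

The size of the action set grows quickly with $N$, $k$, and $M_{\mathrm{capa}}$, which is one reason the capacity restriction matters computationally as well as economically.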

Transition probabilities. The degradation processes of the elements are independent when the system performance level is fixed. In detail, the degradation of each element follows the gamma process. The transition probability of moving from $x$ to $x^{d}$ under replacement $\delta$ and performance control $u$ is $P_{u}(x^{m}(x, \delta), x^{d}) = \prod_{i = 1}^{N} P_{u_{i}}(x_{i}^{m}(x_{i}, \delta_{i}), x_{i}^{d})$. The method of computing the probability $P_{u_{i}}(x_{i}^{m}(x_{i}, \delta_{i}), x_{i}^{d})$ has been given in Equations (1) and (2) in Section 3.1.

4.2. Value Function Formulation

We consider inspection, setup, element replacement, and system failure costs for each time unit. The replacement of a failed element incurs the corrective maintenance cost $c_{\mathrm{cm}}$, while that of a functional but degraded element incurs the preventive maintenance cost $c_{\mathrm{pm}}$. The replacement cost of the $i$th element is denoted as $c_{\mathrm{rep}}(x_{i})$ and

$$c_{\mathrm{rep}}(x_{i}) = \begin{cases} c_{\mathrm{cm}}, & \text{if } x_{i} = D, \\ c_{\mathrm{pm}}, & \text{otherwise}. \end{cases}$$

Therefore, the replacement cost of the LMCCS is $\sum_{i = 1}^{N} \delta_{i} \cdot c_{\mathrm{rep}}(x_{i})$.

The indicator function $I\{\chi\}$ equals 1 if condition $\chi$ is true and 0 otherwise. A setup cost $c_{\mathrm{set}}$ is required if the replacement is initiated, i.e., $\sum_{i = 1}^{N} \delta_{i} > 0$. Hence, the system maintenance setup cost is $I\{\sum_{i = 1}^{N} \delta_{i} > 0\} \cdot c_{\mathrm{set}}$.

The method to determine whether a system failure cost $c_{fail}$ is incurred is as follows. Node $B_{i}$, $i \in \{2, 3, \dots, N+1\}$, is connected by at least one element before $B_{i}$ if $\sum_{j=1}^{k} I\{u_{\max\{1, i-j\}} \geq j\} \geq 1$. If every node $B_{i}$, $i \in \{2, 3, \dots, N+1\}$, is connected by at least one element before $B_{i}$, i.e., $\sum_{i=2}^{N+1} I\{\sum_{j=1}^{k} I\{u_{\max\{1, i-j\}} \geq j\} \geq 1\} = N$, the LMCCS works correctly. Thus, the system failure cost of an LMCCS with system performance $u = (u_{1}, u_{2}, \dots, u_{N})$ is $I\{\sum_{i=2}^{N+1} I\{\sum_{j=1}^{k} I\{u_{\max\{1, i-j\}} \geq j\} \geq 1\} < N\} \cdot c_{fail}$.
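The connectivity test above can be sketched in Python as follows. The function names `system_works` and `failure_cost` are illustrative, not taken from the paper; the performance vector is 0-indexed in code, so element $\max\{1, i-j\}$ is read at position `max(1, i - j) - 1`.

```python
def system_works(u, k):
    """Return True if every node B_i, i = 2..N+1, is reached by some earlier element."""
    n = len(u)
    for i in range(2, n + 2):
        # Element max(1, i - j) reaches node B_i when its performance level is at least j.
        if not any(u[max(1, i - j) - 1] >= j for j in range(1, k + 1)):
            return False
    return True

def failure_cost(u, k, c_fail):
    """System failure cost: c_fail if some node is unreachable, 0 otherwise."""
    return 0 if system_works(u, k) else c_fail

# Performance control of instance I1 in Table 2: all nodes are connected.
print(system_works((1, 2, 0, 2, 0), k=2))
# Hypothetical control in which node B_4 is not reached (u_3 = 0 and u_2 < 2).
print(failure_cost((1, 1, 0, 1, 1), k=2, c_fail=5000))
```

The first call prints `True` and the second prints `5000`, since in the second vector neither element 3 nor element 2 reaches node $B_{4}$.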

The sum of costs $c(x, δ, u)$ in an LMCCS for one unit of time comprises the inspection cost $c_{ins}$, the replacement cost $\sum_{i=1}^{N} δ_{i} \cdot c_{rep}(x_{i})$, the setup cost $I\{\sum_{i=1}^{N} δ_{i} > 0\} \cdot c_{set}$, and the system failure cost. Hence,

$$c(x, δ, u) = c_{ins} + \sum_{i=1}^{N} δ_{i} \cdot c_{rep}(x_{i}) + I\Big\{\sum_{i=1}^{N} δ_{i} > 0\Big\} \cdot c_{set} + I\Big\{\sum_{i=2}^{N+1} I\Big\{\sum_{j=1}^{k} I\{u_{\max\{1, i-j\}} \geq j\} \geq 1\Big\} < N\Big\} \cdot c_{fail}$$

(6)
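As a rough illustration of Equation (6), the sketch below adds up the four cost terms for one time unit. To keep it short, the last indicator of Equation (6), which depends on $u$, is passed in as a precomputed boolean `system_fails`; that parameter and the function name `one_period_cost` are assumptions made for this example. The cost values are those of Table 1.

```python
def one_period_cost(x, delta, system_fails, *, D, c_ins, c_cm, c_pm, c_set, c_fail):
    """One-period cost c(x, delta, u); the dependence on u enters via system_fails."""
    # Corrective cost for failed elements (state D), preventive cost otherwise.
    replacement = sum(d * (c_cm if xi == D else c_pm) for xi, d in zip(x, delta))
    # The setup cost is paid once if at least one element is replaced.
    setup = c_set if sum(delta) > 0 else 0
    failure = c_fail if system_fails else 0
    return c_ins + replacement + setup + failure

costs = dict(D=3, c_ins=5, c_cm=150, c_pm=20, c_set=100, c_fail=5000)

# Instance I1 of Table 2: two elements in state 2 are replaced preventively.
print(one_period_cost((0, 2, 3, 2, 3), (0, 1, 0, 1, 0), False, **costs))
# Same state, but the two failed elements are replaced instead.
print(one_period_cost((0, 2, 3, 2, 3), (0, 0, 1, 0, 1), False, **costs))
```

The two calls print 145 and 405, which shows the immediate saving of 260 from replacing the highly degraded elements rather than the failed ones; the long-run comparison still requires the value function.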

We use $X^{d}(x^{m}(x, δ)) = \{(x_{1}^{d}, x_{2}^{d}, \dots, x_{N}^{d}) \mid x_{i}^{m}(x_{i}, δ_{i}) \leq x_{i}^{d} \leq D\}$ to represent the set of all degradation states that can be reached from state $x$ when the replacement decision is $δ$. An element whose degradation follows the gamma process can only remain in its current state or degrade to a worse state. Hence, $x_{i}^{m}(x_{i}, δ_{i}) \leq x_{i}^{d} \leq D$.
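The reachable set can be enumerated directly, as in the following sketch. It assumes that a replaced element returns to state 0, which matches the post-replace states listed in Table 2; the function names are hypothetical.

```python
from itertools import product

def post_replacement_state(x, delta):
    # Assumption: a replaced element (delta_i = 1) returns to state 0, as in Table 2.
    return tuple(0 if d else xi for xi, d in zip(x, delta))

def reachable_states(x_m, D):
    """All x^d with x_i^m <= x_i^d <= D for every element i."""
    return list(product(*(range(xi, D + 1) for xi in x_m)))

x_m = post_replacement_state((0, 2, 3, 2, 3), (0, 1, 0, 1, 0))
states = reachable_states(x_m, D=3)
print(x_m, len(states))
```

For instance I1 of Table 2 this gives the post-replace state (0, 0, 3, 0, 3) and 4 × 4 × 1 × 4 × 1 = 64 reachable states, because the two failed elements can only stay in state 3.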

Let $β$, $0 < β < 1$, be the discount rate, which determines the present value of future costs. A cost occurring $g$, $g \in ℕ^{+}$, units of time in the future is worth only $β^{g}$ times what it would be if it were incurred immediately.

A value function specifies what is costly in the long run, whereas the abovementioned cost $c(x, δ, u)$ indicates what is costly in a time unit. Therefore, the value function $v(x)$ first considers the immediate cost $c(x, δ, u)$, then the set of possible following states $X^{d}(x^{m}(x, δ))$, and finally, the expected potential costs in those states $\sum_{x^{d} \in X^{d}(x^{m}(x, δ))} P_{u}(x^{m}(x, δ), x^{d}) \cdot v(x^{d})$. These potential costs in the future should be multiplied by the discount rate $β$. Hence, the minimized value function $v(x)$ is as follows:

$$v(x) = \min_{(δ, u) \in A(x)} \Big\{ c(x, δ, u) + \sum_{x^{d} \in X^{d}(x^{m}(x, δ))} P_{u}(x^{m}(x, δ), x^{d}) \cdot v(x^{d}) \cdot β \Big\}$$

(7)

where the constraints on the action space $A(x)$ are given by Equations (3)–(5) in Section 4.1.

4.3. Policy Iteration Algorithm

We adopt policy iteration to determine the optimal action $(δ, u)$ for every $x \in S$. Algorithm 1 gives the pseudocode. Due to the constraints in Equations (3)–(5), the action space $A(x)$ of each $x \in S$ can be different and thus needs to be determined in Step 0, preparation. We start with an arbitrary policy, as shown in Step 1, initialization. The value function of that policy is found in Step 2, policy evaluation. Specifically, for each system state $x \in S$, the effect of the current policy on its value function $v(x)$ is evaluated iteratively by computing the objective function in the minimization problem in Equation (7). Then, based on the updated value function, a new and improved policy is reached in Step 3, policy improvement. In detail, every feasible action $(δ, u) \in A(x)$ of each $x \in S$ is evaluated to find the action that is optimal with respect to the current value function. The policy evaluation and policy improvement steps are then repeated until the optimal action of every state remains unchanged between iterations, which means the optimal solution of the model has been reached.

Algorithm 1: Policy iteration algorithm.

Input: Element number $N$; failure threshold $L$; failure state $D$; maximal performance level $k$; maintenance capacity $M_{capa}$; costs $c_{ins}$, $c_{cm}$, $c_{pm}$, $c_{set}$, $c_{fail}$; shape parameter $γ$ of the gamma process; scale parameter $λ(u_{i})$ for $u_{i} \in \{0, 1, \dots, k\}$; discount rate $β$; termination tolerance $ε$.
Step 0: (Preparation) Determine the action space $A(x)$ for all $x \in S$ according to Equations (3)–(5). Compute the transition probabilities $P_{u}(x, x^{d})$ according to Equations (1) and (2) for all $x \in S$, $x^{d} \in X^{d}(x)$, and $u \in A(x)$. Compute the sum of costs $c(x, δ, u)$ for all $x \in S$ and $(δ, u) \in A(x)$ according to Equation (6).
Step 1: (Initialization) The maintenance policy and the performance control policy for the LMCCS in state $x$ are denoted as $π_{δ}(x)$ and $π_{u}(x)$, respectively. $π_{δ}(x)$ and $π_{u}(x)$ are vectors of length $N$. Let $v(x) = 0$, $π_{δ}(x) = 0$, $π_{u}(x) = 0$ for all $x \in S$.
Step 2: Policy Evaluation
Repeat:
    $Δ = 0$ (The symbol $Δ$ denotes the difference in the value function between two successive iterations)
    For $x \in S$:
        $\mathit{old}_{v} = v(x)$ (Let $\mathit{old}_{v}$ denote a copy of the value function)
        $δ = π_{δ}(x)$, $u = π_{u}(x)$
        $v(x) = c(x, δ, u) + \sum_{x^{d} \in X^{d}(x^{m}(x, δ))} P_{u}(x^{m}(x, δ), x^{d}) \cdot v(x^{d}) \cdot β$
        $Δ = \max\{Δ, |\mathit{old}_{v} - v(x)|\}$
Until $Δ < ε$ (The symbol $ε$ is the termination tolerance)
Step 3: Policy Improvement
$\mathit{PolicyStable} = \mathit{TRUE}$ (Let $\mathit{PolicyStable}$ indicate whether the policies $π_{δ}(x)$ and $π_{u}(x)$ remain stable)
For $x \in S$:
    $\mathit{old}_{π} = (π_{δ}(x), π_{u}(x))$ (Let $\mathit{old}_{π}$ denote a copy of the current policy)
    $(π_{δ}(x), π_{u}(x)) = \operatorname{argmin}_{(δ, u) \in A(x)} \{c(x, δ, u) + \sum_{x^{d} \in X^{d}(x^{m}(x, δ))} P_{u}(x^{m}(x, δ), x^{d}) \cdot v(x^{d}) \cdot β\}$ ($\operatorname{argmin}_{z}\{f(z)\}$ means the value of $z$ at which $f(z)$ takes its minimal value)
    If $\mathit{old}_{π} \neq (π_{δ}(x), π_{u}(x))$, then $\mathit{PolicyStable} = \mathit{FALSE}$.
If $\mathit{PolicyStable}$, then stop and return $v(x)$, $π_{δ}(x)$, $π_{u}(x)$ for all $x \in S$; else go back to Step 2 (Policy Evaluation).
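The structure of Algorithm 1 can be sketched in Python for a generic finite discounted-cost MDP. This is not the authors' implementation: the dictionaries `actions`, `cost`, and `trans`, and the single-element toy problem with its numbers, are hypothetical. One deliberate difference from Algorithm 1 is that the policy is changed only when an action is strictly better, which avoids cycling between tied actions.

```python
def policy_iteration(states, actions, cost, trans, beta, eps):
    """actions[s]: feasible actions; cost[s][a]: one-period cost;
    trans[s][a]: dict mapping next state -> probability."""
    v = {s: 0.0 for s in states}
    pi = {s: actions[s][0] for s in states}  # arbitrary initial policy

    def q(s, a):
        # Immediate cost plus discounted expected future cost.
        return cost[s][a] + beta * sum(p * v[t] for t, p in trans[s][a].items())

    while True:
        # Step 2: policy evaluation by in-place sweeps until the change is below eps.
        while True:
            delta = 0.0
            for s in states:
                old_v = v[s]
                v[s] = q(s, pi[s])
                delta = max(delta, abs(old_v - v[s]))
            if delta < eps:
                break
        # Step 3: policy improvement against the evaluated value function.
        stable = True
        for s in states:
            best = min(actions[s], key=lambda a: q(s, a))
            if q(s, best) < q(s, pi[s]) - 1e-12:
                pi[s] = best
                stable = False
        if stable:
            return v, pi

# Toy single-element problem: states 0 (new), 1 (degraded), 2 (failed).
states = [0, 1, 2]
actions = {s: ["keep", "replace"] for s in states}
fresh = {0: 0.7, 1: 0.2, 2: 0.1}  # degradation of a new element over one period
cost = {0: {"keep": 0, "replace": 20},
        1: {"keep": 0, "replace": 20},
        2: {"keep": 100, "replace": 60}}
trans = {0: {"keep": fresh, "replace": fresh},
         1: {"keep": {1: 0.5, 2: 0.5}, "replace": fresh},
         2: {"keep": {2: 1.0}, "replace": fresh}}

v, pi = policy_iteration(states, actions, cost, trans, beta=0.9, eps=1e-8)
print(pi, {s: round(v[s], 2) for s in states})
```

In this toy problem the returned policy keeps a new element and replaces a degraded or failed one, with values of about 90, 110, and 150 for states 0, 1, and 2. For the LMCCS, `states` would be the 1024 degradation vectors and `actions[s]` the feasible $(δ, u)$ pairs from Step 0.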

5. Numerical Experiments

In this section, we employ numerical experiments to explore the CBMP policy. Section 5.1 elaborates on the main experiment, which focuses on the reverse balance and the balance in degradation management. Section 5.2 introduces a benchmark policy that considers CBM but predetermines a load-sharing rule. The CBMP policy compares favorably with the benchmark because the CBMP policy offers different load-sharing arrangements according to different system degradation states. Section 5.3 performs the sensitivity analysis to observe how the change in several parameter values influences the optimal CBMP policy. The whole solution process is implemented in Python 3.10 (64-bit) on a machine with an Intel® Core™ i7-1165G7 CPU @ 2.80 GHz and 16 GB of 3733 MHz memory.

5.1. Main Experiment

In the main experiment, we discuss one of our contributions, i.e., explorations of the reverse balance of degradation. Section 5.1.1 introduces the parameters of the model and assumptions concerning the performance–degradation relation, for which the degradation rate of an element is a function of its performance level. Section 5.1.2 demonstrates examples of three categories of the reverse balance and one category of the balance of degradation.

5.1.1. Parameters of the Main Experiment

We consider an LMCCS comprising five elements. Each element has four degradation states: state 0 means not degraded, state 1 means moderately degraded, state 2 means highly degraded, and state 3 means failed. Hence, there are a total of $4^{5} = 1024$ possible system states. The maximal performance level $k$ of each element is 2. Level 0 means idle, level 1 indicates that the element can connect to the next node, and level 2 indicates that it can connect to the following two nodes. The maintenance capacity $M_{capa}$ is 2, which means at most two elements can be replaced in a decision period. The inspection, maintenance setup, preventive maintenance, corrective maintenance, and system failure costs are 5, 100, 20, 150, and 5000, respectively. The discount rate and the termination tolerance for the policy iteration algorithm are 0.97 and 0.00001, respectively.
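The size of the state space, and of the replacement part of the action space, can be checked with a few lines of Python. The sketch applies only the capacity limit stated above; the remaining constraints of Equations (3)–(5) are not reproduced here and may remove further actions.

```python
from itertools import product

N, D, M_capa = 5, 3, 2

# Every element is in one of the states 0..D, so there are (D + 1)^N system states.
states = list(product(range(D + 1), repeat=N))

# Replacement vectors delta in {0, 1}^N with at most M_capa replacements.
replacements = [d for d in product((0, 1), repeat=N) if sum(d) <= M_capa]

print(len(states), len(replacements))
```

This prints 1024 and 16, the latter being $\binom{5}{0} + \binom{5}{1} + \binom{5}{2}$ candidate replacement vectors per state before any other restriction is applied.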

As mentioned in Section 3.1, the degradation of each element follows the gamma process, so the degradation rate of the $i$th element is the product of the shape and scale parameters, $γ λ(u_{i})$. The relation between the performance level $u_{i}$ and the degradation rate $γ λ(u_{i})$ can be modeled by, for example, power, exponential, or logarithmic functions, or combinations of these functions. We model the performance–degradation relation as follows:

$$γ λ(u_{i}) = γ λ(0) + (γ λ(k) - γ λ(0)) \cdot (u_{i} / k)^{1.1}$$

(8)

where the shape parameter $γ = 2.25$, the degradation rate of an idle element $γ λ(0) = 0.15$, and that of an element at the maximal performance level $γ λ(k) = 1.20$. We assume that an element still gradually degrades while not working. In addition, we set the exponent of $(u_{i} / k)^{1.1}$ to 1.1 so that the degradation rate rises at an increasing pace as the performance level increases. An exponent greater than one implies that $λ(0) + λ(2) > 2 λ(1)$, which encourages sharing the total system load equally among functional elements. Substituting $u_{i} = 1$ and $k = 2$ into Equation (8) gives the degradation rate $γ λ(1) \approx 0.64$ for an element at performance level 1. The parameter values of the main experiment are summarized in Table 1.
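Equation (8) is easy to evaluate numerically. The short sketch below uses the parameter values stated above; the function name `degradation_rate` and its keyword arguments are illustrative.

```python
def degradation_rate(u, k=2, rate_idle=0.15, rate_max=1.20, exponent=1.1):
    """Degradation rate gamma * lambda(u) from Equation (8)."""
    return rate_idle + (rate_max - rate_idle) * (u / k) ** exponent

gamma_shape = 2.25
rates = {u: degradation_rate(u) for u in range(3)}
# Scale parameters lambda(u) follow by dividing the rate by the shape parameter.
scales = {u: r / gamma_shape for u, r in rates.items()}

for u in range(3):
    print(u, round(rates[u], 4), round(scales[u], 4))
```

The rate at level 1 evaluates to about 0.6398, which rounds to the 0.64 listed in Table 1. Two elements at level 1 therefore degrade at a combined rate of about 1.28, compared with 1.35 for one idle element and one element at level 2.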

5.1.2. Reverse Balance and Balance in Degradation Management

The optimal replacement and performance control of the 1024 system states can be reached through the policy iteration algorithm provided in Section 4.3. Some of the results are presented in Table 2, where the system state before replacement (original state), the optimal replacement action, the system state after replacement (post-replace state), the optimal performance control, and the value function are shown. Based on the results, we discuss one category of the reverse balance resulting from replacement, two categories of the reverse balance caused by performance control, and one category of the balance by performance control.

Instances I1–I6 in group G1 (see Table 2) illustrate the reverse balance resulting from replacement. The failed elements, i.e., the elements with degradation state 3, are not prioritized for replacement, while highly degraded elements, i.e., the elements with degradation state 2, are replaced first. For example, in instance I1, the original state is (0,2,3,2,3), and the optimal action (0,1,0,1,0) replaces the two elements in state 2 while leaving both failed elements in place. This differs from the control limit policy, under which maintenance occurs once the degradation level exceeds a threshold. The control limit policy is intuitive since the degradation level surpassing the control limit implies an underlying risk of system failure. The reason to avoid system failure is that system failure or downtime costs are high for some systems in real life. Similarly, regarding specific multi-element systems, an element failure cost exists, reflected by a difference between preventive and corrective replacement costs in our model. The element failure cost is realistic because irreparable damage is possible upon element failure, reducing the residual value of a replaced element. Therefore, when the maintenance capacity is restricted, a trade-off has to be reached between replacing failed and highly degraded elements. Giving priority to failed elements better strengthens the system reliability, while replacing highly degraded ones first can avoid element failure costs. Hence, a reverse balance may occur in a redundant system with a relatively low system failure cost, a high element failure cost, and a limited maintenance capacity. Such a reverse balance can be observed in redundant systems other than LMCCSs, e.g., parallel systems and variants of k-out-of-n systems [8,9,10,41].

Instances I7–I14 in group G2 (see Table 2) display the reverse balance arising from performance control, where a more-degraded element takes more load than a less-degraded one, i.e., the performance level of a more-degraded element is set higher than that of less-degraded ones. For example, in instance I7 group G2, when the post-replace state is (0,0,0,1,2), the performance level is (1,1,1,2,0), where the performance level of the element with degradation state 0 is 1 while the performance level of the element with degradation state 1 is 2. The following three factors contribute to one type of reverse balance caused by performance control. First, the LMCCS’s redundancy allows a functional element to be switched off, which brings down the degradation rate of the element. Second, switching off highly degraded elements can be cost-effective when the element failure cost is high. Third, the unique structural characteristics of the LMCCS constrain the load-sharing arrangement. The load of an element in the LMCCS can only be taken by several elements before it. Therefore, if not properly located, the least-degraded element cannot take the load from a highly degraded element, which results in a reverse balance. The abovementioned reverse balance can be observed in instances I7–I10, group G2, Table 2. Another type of reverse balance can arise if the structural characteristics of the LMCCS prevent the least-degraded elements from taking on more load when there is a failed element. Instances I11–I14 in group G2 in Table 2 demonstrate this type of reverse balance.

A balance of degradation may occur simultaneously with the reverse balance mentioned above. For example, the elements with degradation state 2 are actively switched off, and their loads are taken by elements with degradation state 0 or 1 in instances I7–I9, group G2, Table 2. Using a less-degraded element to protect a more-degraded element from failure through load-sharing (performance control) is an active balance of degradation. In addition, the active switch-off of a functional element based on system degradation states can be regarded as a condition-based mission abort [42], where an element’s work is stopped to ensure its survivability. The merit of our approach is that it considers load-sharing and redundancy together, so that both the correct functioning of the system and the survivability of its elements are achieved.

The maintenance setup cost and the maintenance capacity influence the occurrence of the balances and reverse balances. A high setup cost can delay the replacement of degraded elements, and a low maintenance capacity restricts the number of replacements, both of which can lead to a poor system state. It is realistic that various restrictions on maintenance capacity exist in real life, and the setup cost is expensive for specific systems. However, the active balance and reverse balances discussed in this section stem from coping with systems with highly degraded or failed elements, and the cost caused by unfavorable factors, e.g., low maintenance capacity and high setup costs, can be reduced by the condition-based maintenance and performance control.

5.2. Comparison Experiment

This section first defines a benchmark policy and then presents the results of the comparison experiment, where how and why the CBMP outperforms the benchmark is also analyzed.

The benchmark policy consists of predetermined performance control and CBM. The predetermined performance control is introduced as follows. The system load is equally distributed among all system elements if they are all functional. When an element fails, its load is taken by its closest functional element before it. If no functional element meets such requirements, the LMCCS fails, and the remaining functional elements are switched off. Whether the system fails or not, we assume an element degrades according to its performance level. The CBM decision is made considering the predetermined performance control. From the perspective of MDP modeling, the action space of the benchmark is a proper subset of the CBMP problem’s action space. It means that the CBMP problem has more alternatives in making performance control and replacement decisions, which guarantees that the optimal CBMP policy is better than or at least equal to the optimal benchmark policy.

According to the results of the experiments, the mean value of the value function under CBMP is 4366.71, 6.54% smaller than the one under the benchmark policy, which is 4672.32. Meanwhile, for every state, the value function of CBMP is smaller than that of the benchmark policy. Under the two policies, 342 system states have different optimal actions, accounting for 33.40% of all 1024 states. The experiment results and the above theoretical analysis consistently indicate that CBMP outperforms the benchmark. Next, we examine the advantages of CBMP over the benchmark.
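As a quick check of the two percentages reported above, the arithmetic can be written out using only the figures stated in the text:

```latex
\frac{4672.32 - 4366.71}{4672.32} = \frac{305.61}{4672.32} \approx 0.0654,
\qquad
\frac{342}{1024} \approx 0.3340 .
```

Both ratios agree with the reported 6.54% reduction in the mean value function and the 33.40% share of states whose optimal actions differ.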

The CBMP policy allows for actively switching off a functional element. Because the degradation rate in Equation (8) is convex in the performance level, shifting the load of the switched-off element to another element raises the average degradation rate of system elements. Moreover, switching off a functional element may cause system failure when certain other elements have already failed. Therefore, the performance control should be condition-based and optimized so that the cost savings brought by switching off a functional element outweigh its side effects. Specifically, through an active balance of degradation, the load of a highly degraded element is taken by less-degraded ones, averting element failure, which also strengthens the system reliability. Table 3 shows some of the results of CBMP and the benchmark policy, where the first row of each instance is the result of CBMP, while the second row in gray shading is the result of the benchmark policy. As shown in instances I15–I19, group G3 in Table 3, because the system reliability is enhanced by the active performance control (load-sharing), replacements are delayed in the CBMP policy compared with the benchmark. Postponing replacements means that the remaining useful life of degraded elements can be exploited more fully. In addition, the postponed replacements and the balanced degradation levels can cluster replacements to achieve economies of scale.

5.3. Sensitivity Analysis

We discussed the balance and the reverse balance in the main experiment. In the comparison experiment, the advantages of the CBMP policy over the benchmark stem from the active balance by condition-based performance control. Several groups of sensitivity analysis are conducted to better investigate the impacts of uncertainties in the model parameters on the optimal CBMP, including the balance and the reverse balance. Based on the parameters in Table 1, we investigate how the maintenance capacity, the setup cost, and the difference between preventive and corrective replacement costs affect the optimal results.

5.3.1. Sensitivity Analysis concerning Maintenance Capacity

In the main experiment, the maintenance capacity $M_{capa}$ is set to 2 so that, for a system with multiple degraded elements, the post-replacement system state is still degraded. Since the balance and the reverse balance discussed in Section 5.1.2 arise from coping with a degraded system state, we may no longer observe such phenomena if the maintenance capacity is not restricted. Table 4 shows the results with a different value of maintenance capacity, in which the first row of each instance is the result of the main experiment, where $M_{capa} = 2$, while the second row in gray shading results from changing $M_{capa}$ to 5. In instances I20–I22 in group G4 (see Table 4), when $M_{capa} = 2$, highly degraded elements are replaced first rather than the failed ones, and reverse balances regarding performance control can be observed in I23–I25, G5, Table 4. However, when the capacity is lifted to 5, i.e., no limit on the number of replacements exists, the elements in I20–I25 are all replaced so that the systems are new, and neither the balance nor the reverse balance discussed in Section 5.1.2 occurs.

5.3.2. Sensitivity Analysis concerning Maintenance Setup Costs

In the main experiment, the setup cost is relatively high compared with the preventive and corrective replacement costs ($c_{set} = 100$, $c_{pm} = 20$, $c_{cm} = 150$), which can postpone the replacement and lead to a degraded system state. Table 5 shows the results with a different value of the setup cost, in which the first row of each instance is the result of the main experiment, where $c_{set} = 100$, while the second row in gray shading results from changing $c_{set}$ to 20. When $c_{set} = 100$, the active balance by performance control can be seen in I26–I29, Table 5, and reverse balances by performance control can be observed in I28–I31, where instances I28–I29 in G7 concern protecting highly degraded elements, and instances I30–I31 in G8 are about load-sharing responding to element failure. However, when the setup cost is decreased from 100 to 20, in I26–I31, the highly degraded and failed elements are replaced, preventing the balance and the reverse balance discussed in Section 5.1.2 from happening.

5.3.3. Sensitivity Analysis concerning the Difference between Preventive and Corrective Replacement Costs

As mentioned in Section 5.1.2, the reverse balance by replacement and the active balance by performance control arise from reducing the failure risk of highly degraded elements so that the expected element failure cost, reflected by the difference between preventive and corrective replacement costs, is decreased. Table 6 shows the results with a different value of the element failure cost, in which the first row of each instance is the result of the main experiment, where $c_{pm} = 20$, $c_{cm} = 150$, while the second row in gray shading results from $c_{pm} = 20$, $c_{cm} = 80$. When the difference between preventive and corrective replacement costs is decreased from 130 ($c_{pm} = 20$, $c_{cm} = 150$) to 60 ($c_{pm} = 20$, $c_{cm} = 80$), the balances and reverse balances can still be observed, as presented in instances I32–I37, groups G9–G11, Table 6. Group G9 is about the reverse balance by replacement, group G10 relates to the balance and the reverse balance by performance control, and group G11 concerns the reverse balance by performance control arising from element failure.

6. Conclusions

We propose an optimal CBMP policy for LMCCSs to minimize long-term maintenance-related costs, including system failure costs. Specifically, optimal maintenance and performance control for every system degradation state is reached through MDP modeling and the policy iteration algorithm. In addition, we model the element degradation by the gamma process and discretize the continuous degradation process so that it can be modeled by the MDP. Our model and the algorithm also apply to linear consecutive-k-out-of-n:F systems because the LMCCS generalizes such systems. The optimal condition-based load-sharing for LMCCSs is also covered in the CBMP optimization. We have examined five critical factors influencing the optimal CBMP policy: load-sharing, structural characteristics of the LMCCS, maintenance setup cost, limited maintenance capacity, and the difference between preventive and corrective replacement costs. In numerical experiments, we analyze reverse balances resulting from replacement and performance control and the active balance through performance control. The reverse balance by replacement indicates that condition-based maintenance policies regarding a maintenance threshold, e.g., a control limit, may not always be optimal.

We suggest three directions for future research. First, the maintenance capacity is predetermined in our study, while it can be dynamic for future studies. Spare inventory replenishment, transportation, budget, and other concrete constraints on maintenance resources can be considered. Second, a node can hold more than one element in some LMCCSs, complicating the condition-based load-sharing decision. Researchers can consider the CBMP policy for other variants of the LMCCS. Last, real-world systems may pose challenges for collecting and processing information on the system degradation state. Making maintenance and performance control decisions under partial or inaccurate information is worth studying.

Author Contributions

Conceptualization, J.W. and Y.W.; funding acquisition, J.W. and Y.F.; methodology, J.W. and Y.W.; software, Y.W.; supervision, Y.F.; validation, Y.F.; writing—original draft, J.W. and Y.W.; writing—review and editing, J.W. and Y.F. All authors have read and agreed to the published version of the manuscript.

Funding

This work was supported by the National Natural Science Foundation of China under grants #72271226, #72271049, and #72101025, the Beijing Foreign Studies University ‘Double First Class’ Major Landmark Project under grant #2022SYLZD001, and the Fundamental Research Funds for the Central Universities under grant #2023JJ016.

Data Availability Statement

Data sharing is not applicable to this paper since no data set is used in the current study.

Acknowledgments

The authors express their gratitude to the referees for providing valuable suggestions and comments that helped improve the paper. Jun Wang and Yuyang Wang are thankful to the International Business School, Beijing Foreign Studies University. Yuqiang Fu is thankful to the School of Mathematics and Physics, University of Science and Technology Beijing.

Conflicts of Interest

The authors declare no conflict of interest.

References

1. Hwang, F.K.; Yao, Y.C. Multistate Consecutively-Connected Systems. IEEE Trans. Reliab. 1989, 38, 472–474.
2. Zuo, M.J.; Liang, M. Reliability of Multistate Consecutively-Connected Systems. Reliab. Eng. Syst. Saf. 1994, 44, 173–176.
3. Kuo, W.; Zhang, W.; Zuo, M. A Consecutive-k-out-of-n:G System: The Mirror Image of a Consecutive-k-out-of-n:F System. IEEE Trans. Reliab. 1990, 39, 244–253.
4. Peng, R.; Xie, M.; Ng, S.H.; Levitin, G. Element Maintenance and Allocation for Linear Consecutively Connected Systems. IIE Trans. 2012, 44, 964–973.
5. Yu, H.; Yang, J.; Peng, R.; Zhao, Y. Reliability Evaluation of Linear Multi-State Consecutively-Connected Systems Constrained by m Consecutive and n Total Gaps. Reliab. Eng. Syst. Saf. 2016, 150, 35–43.
6. Endharta, A.J.; Yun, W.Y. Condition-Based Maintenance Policy for Linear Consecutive-k-out-of-n:F System. Commun. Stat. Simul. Comput. 2017, 46, 3088–3102.
7. Olde Keizer, M.C.A.; Teunter, R.H.; Veldman, J.; Babai, M.Z. Condition-Based Maintenance for Systems with Economic Dependence and Load Sharing. Int. J. Prod. Econ. 2018, 195, 319–327.
8. Bjarnason, E.T.S.; Taghipour, S.; Banjevic, D. Joint Optimal Inspection and Inventory for a K-out-of-n System. Reliab. Eng. Syst. Saf. 2014, 131, 203–215.
9. Barron, Y. Group Maintenance Policies for an R-out-of-N System with Phase-Type Distribution. Ann. Oper. Res. 2018, 261, 79–105.
10. Song, Y.; Wang, X. Reliability Analysis of the Multi-State k-out-of-n: F Systems with Multiple Operation Mechanisms. Mathematics 2022, 10, 4615.
11. Salari, N.; Makis, V. Joint Maintenance and Just-in-Time Spare Parts Provisioning Policy for a Multi-Unit Production System. Ann. Oper. Res. 2020, 287, 351–377.
12. uit het Broek, M.A.J.; Teunter, R.H.; de Jonge, B.; Veldman, J. Joint Condition-Based Maintenance and Load-Sharing Optimization for Two-Unit Systems with Economic Dependency. Eur. J. Oper. Res. 2021, 295, 1119–1131.
13. Geurtsen, M.; Didden, J.B.H.C.; Adan, J.; Atan, Z.; Adan, I. Production, Maintenance and Resource Scheduling: A Review. Eur. J. Oper. Res. 2023, 305, 501–529.
14. Aramon Bajestani, M.; Banjevic, D.; Beck, J.C. Integrated Maintenance Planning and Production Scheduling with Markovian Deteriorating Machine Conditions. Int. J. Prod. Res. 2014, 52, 7377–7400.
15. Ghaleb, M.; Taghipour, S.; Zolfagharinia, H. Real-Time Integrated Production-Scheduling and Maintenance-Planning in a Flexible Job Shop with Machine Deterioration and Condition-Based Maintenance. J. Manuf. Syst. 2021, 61, 423–449.
16. Shen, Y.; Zhang, X.; Shi, L. Joint Optimization of Production and Maintenance for a Serial–Parallel Hybrid Two-Stage Production System. Reliab. Eng. Syst. Saf. 2022, 226, 108600.
17. An, Y.; Chen, X.; Hu, J.; Zhang, L.; Li, Y.; Jiang, J. Joint Optimization of Preventive Maintenance and Production Rescheduling with New Machine Insertion and Processing Speed Selection. Reliab. Eng. Syst. Saf. 2022, 220, 108269.
18. uit het Broek, M.A.J.; Teunter, R.H.; de Jonge, B.; Veldman, J.; van Foreest, N.D. Condition-Based Production Planning: Adjusting Production Rates to Balance Output and Failure Risk. Manuf. Serv. Oper. Manag. 2020, 22, 792–811.
19. uit het Broek, M.A.J.; Teunter, R.H.; de Jonge, B.; Veldman, J. Joint Condition-Based Maintenance and Condition-Based Production Optimization. Reliab. Eng. Syst. Saf. 2021, 214, 107743.
20. Zhao, X.; He, Z.; Wu, Y.; Qiu, Q. Joint Optimization of Condition-Based Performance Control and Maintenance Policies for Mission-Critical Systems. Reliab. Eng. Syst. Saf. 2022, 226, 108655.
21. Olde Keizer, M.C.A.; Flapper, S.D.P.; Teunter, R.H. Condition-Based Maintenance Policies for Systems with Multiple Dependent Components: A Review. Eur. J. Oper. Res. 2017, 261, 405–420.
22. Liu, B.; Xu, Z.; Xie, M.; Kuo, W. A Value-Based Preventive Maintenance Policy for Multi-Component System with Continuously Degrading Components. Reliab. Eng. Syst. Saf. 2014, 132, 83–89.
23. Olde Keizer, M.C.A.; Teunter, R.H.; Veldman, J. Joint Condition-Based Maintenance and Inventory Optimization for Systems with Multiple Components. Eur. J. Oper. Res. 2017, 257, 209–222.
24. Wang, J.; Qiu, Q.; Wang, H. Joint Optimization of Condition-Based and Age-Based Replacement Policy and Inventory Policy for a Two-Unit Series System. Reliab. Eng. Syst. Saf. 2021, 205, 107251.
25. Wang, J.; Zhu, X. Joint Optimization of Condition-Based Maintenance and Inventory Control for a k-out-of-n:F System of Multi-State Degrading Components. Eur. J. Oper. Res. 2021, 290, 514–529.
26. Zhu, X.; Wang, J.; Coit, D.W. Joint Optimization of Spare Part Supply and Opportunistic Condition-Based Maintenance for Onshore Wind Farms Considering Maintenance Route. IEEE Trans. Eng. Manag. 2022, 216, 1–17.
27. Bajestani, M.A.; Banjevic, D. Calendar-Based Age Replacement Policy with Dependent Renewal Cycles. IIE Trans. 2016, 48, 1016–1026.
28. Wang, J.; Qiu, Q.; Wang, H.; Lin, C. Optimal Condition-Based Preventive Maintenance Policy for Balanced Systems. Reliab. Eng. Syst. Saf. 2021, 211, 107606.
29. Chen, K.; Zhao, X.; Qiu, Q. Optimal Task Abort and Maintenance Policies Considering Time Redundancy. Mathematics 2022, 10, 1360.
30. Li, S.; Wen, M.; Zu, T.; Kang, R. Condition-Based Maintenance Optimization Method Using Performance Margin. Axioms 2023, 12, 168.
31. de Jonge, B.; Scarf, P.A. A Review on Maintenance Optimization. Eur. J. Oper. Res. 2020, 285, 805–824.
32. Alaswad, S.; Xiang, Y. A Review on Condition-Based Maintenance Optimization Models for Stochastically Deteriorating System. Reliab. Eng. Syst. Saf. 2017, 157, 54–63.
33. Sun, Q.; Ye, Z.-S.; Chen, N. Optimal Inspection and Replacement Policies for Multi-Unit Systems Subject to Degradation. IEEE Trans. Reliab. 2018, 67, 401–413.
34. Fu, Y.; Yuan, T.; Zhu, X. Optimum Periodic Component Reallocation and System Replacement Maintenance. IEEE Trans. Reliab. 2019, 68, 753–763.
35. Fu, Y.; Yuan, T.; Zhu, X. Importance-Measure Based Methods for Component Reassignment Problem of Degrading Components. Reliab. Eng. Syst. Saf. 2019, 190, 106501.
36. Sun, Q.; Ye, Z.-S.; Zhu, X. Managing Component Degradation in Series Systems for Balancing Degradation through Reallocation and Maintenance. IISE Trans. 2020, 52, 797–810.
37. Fu, Y.; Wang, J. Optimum Periodic Maintenance Policy of Repairable Multi-Component System with Component Reallocation and System Overhaul. Reliab. Eng. Syst. Saf. 2022, 219, 108224.
38. Cai, Z.; Si, S.; Sun, S.; Li, C. Optimization of Linear Consecutive-k-out-of-n System with a Birnbaum Importance-Based Genetic Algorithm. Reliab. Eng. Syst. Saf. 2016, 152, 248–258.
39. Ye, Z.-S.; Xie, M. Stochastic Modelling and Analysis of Degradation for Highly Reliable Products. Appl. Stoch. Models Bus. Ind. 2015, 31, 16–32.
40. van Noortwijk, J.M. A Survey of the Application of Gamma Processes in Maintenance. Reliab. Eng. Syst. Saf. 2009, 94, 2–21.
41. Zhao, F.; Peng, R.; Zhang, N. Inspection Policy Optimization for a K-out-of-n/C(K′,N′;F) System Considering Failure Dependence: A Case Study. Reliab. Eng. Syst. Saf. 2023, 237, 109331.
42. Zhao, X.; Chai, X.; Sun, J.; Qiu, Q. Joint Optimization of Mission Abort and Protective Device Selection Policies for Multistate Systems. Risk Anal. 2022, 42, 2823–2834.

Figure 1. Nodes, elements, and the performance level in an LMCCS.

Table 1. Parameter values of the main experiment.

Parameter	Value	Interpretation
$N$	5	Element number
$L$	3	Failure threshold
$D$	3	Failure state
$k$	2	Maximal performance level
$M_{capa}$	2	Limit on the number of simultaneous replacements
$c_{ins}$	5	Inspection cost
$c_{cm}$	150	Corrective maintenance cost
$c_{pm}$	20	Preventive maintenance cost
$c_{set}$	100	Maintenance setup cost
$c_{fail}$	5000	System failure cost
$γ$	2.25	Shape parameter of the gamma process
$λ(0)$	0.15/2.25	Scale parameter at performance level 0
$λ(1)$	0.64/2.25	Scale parameter at performance level 1
$λ(2)$	1.20/2.25	Scale parameter at performance level 2
$β$	0.97	Discount rate
$ε$	0.00001	Termination tolerance
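
Because the scale parameters in Table 1 are written as ratios, it can help to recover the per-period mean and variance of the degradation increment that they imply. The following is a minimal sketch, assuming that the increment over one inspection interval is gamma distributed with shape $γ$ and scale $λ(u)$, so that its mean is $γλ(u)$ and its variance is $γλ(u)^2$. The one-period interval and the names `SHAPE`, `SCALE`, and `increment_moments` are illustrative assumptions, not notation from the paper.

```python
# Shape and scale values copied from Table 1.
SHAPE = 2.25
SCALE = {0: 0.15 / 2.25, 1: 0.64 / 2.25, 2: 1.20 / 2.25}


def increment_moments(level, shape=SHAPE, scale=SCALE):
    """Return (mean, variance) of a gamma increment with the given shape and scale.

    Assumption: one inspection interval corresponds to one unit of the
    gamma process, so the shape parameter is used as-is.
    """
    lam = scale[level]
    return shape * lam, shape * lam ** 2


for level in sorted(SCALE):
    mean, var = increment_moments(level)
    print(f"level {level}: mean = {mean:.2f}, variance = {var:.4f}")
```

Under this reading, the numerators 0.15, 0.64, and 1.20 are the mean increments per period at performance levels 0, 1, and 2, which makes the faster degradation at higher performance levels explicit.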

Table 2. Instances in the main experiment.

Group	Instance	Original State ($x$)	Replacement Action ($δ$)	Post-Replacement State ($x^{m}$)	Performance Control ($u$)	Value Function $v(x)$
G1	I1	(0,2,3,2,3)	(0,1,0,1,0)	(0,0,3,0,3)	(1,2,0,2,0)	4504.20
	I2	(0,3,2,2,3)	(0,0,1,1,0)	(0,3,0,0,3)	(2,0,1,2,0)	4504.38
	I3	(2,2,3,1,3)	(1,1,0,0,0)	(0,0,3,1,3)	(1,2,0,2,0)	4552.07
	I4	(2,3,2,3,1)	(1,0,1,0,0)	(0,3,0,3,1)	(2,0,2,0,1)	4544.96
	I5	(2,2,2,3,2)	(1,0,1,0,0)	(0,2,0,3,2)	(2,0,2,0,1)	4498.97
	I6	(2,2,3,2,3)	(1,1,0,0,0)	(0,0,3,2,3)	(1,2,0,2,0)	4624.48
G2	I7	(0,0,0,1,2)	(0,0,0,0,0)	(0,0,0,1,2)	(1,1,1,2,0)	4097.94
	I8	(2,1,2,3,2)	(1,0,0,1,0)	(0,1,2,0,2)	(1,2,0,2,0)	4438.67
	I9	(3,1,2,1,2)	(1,0,0,0,1)	(0,1,2,1,0)	(1,2,0,1,1)	4403.44
	I10	(2,1,2,2,3)	(1,0,0,1,0)	(0,1,2,0,3)	(1,2,0,2,0)	4430.72
	I11	(2,2,3,2,2)	(1,0,0,1,0)	(0,2,3,0,2)	(1,2,0,2,0)	4500.64
	I12	(1,3,0,1,1)	(0,0,0,0,0)	(1,3,0,1,1)	(2,0,1,1,1)	4291.94
	I13	(1,3,1,0,1)	(0,0,0,0,0)	(1,3,1,0,1)	(2,0,1,1,1)	4293.01
	I14	(3,1,3,2,3)	(1,0,0,1,0)	(0,1,3,0,3)	(1,2,0,2,0)	4682.21
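
The performance-control vectors in Table 2 can be checked against the connectivity requirement stated in the abstract, namely that the system fails if the first and the final elements are not connected. The sketch below is one possible reading, not the authors' implementation: it assumes that element $i$ sits at node $i$, that performance level $u_i$ links node $i$ to nodes $i+1, \dots, i+u_i$, and that the system works when node 1 reaches node $N+1$. The function name `is_connected` and the list `TABLE_2` are hypothetical.

```python
FAILED = 3  # failure state D from Table 1


def is_connected(u):
    """Return True if node 1 can reach node N + 1 under control vector u.

    Assumption: element i at level u[i-1] links node i to the next u[i-1]
    nodes, so the reachable nodes always form the interval 1..reach.
    """
    reach = 1  # farthest node reachable so far
    for i, level in enumerate(u, start=1):
        if i > reach:  # node i cannot be reached, so the chain is broken
            return False
        reach = max(reach, i + level)
    return reach >= len(u) + 1


# (post-replacement state, performance control) pairs copied from Table 2.
TABLE_2 = [
    ((0, 0, 3, 0, 3), (1, 2, 0, 2, 0)),  # I1
    ((0, 3, 0, 0, 3), (2, 0, 1, 2, 0)),  # I2
    ((0, 0, 3, 1, 3), (1, 2, 0, 2, 0)),  # I3
    ((0, 3, 0, 3, 1), (2, 0, 2, 0, 1)),  # I4
    ((0, 2, 0, 3, 2), (2, 0, 2, 0, 1)),  # I5
    ((0, 0, 3, 2, 3), (1, 2, 0, 2, 0)),  # I6
    ((0, 0, 0, 1, 2), (1, 1, 1, 2, 0)),  # I7
    ((0, 1, 2, 0, 2), (1, 2, 0, 2, 0)),  # I8
    ((0, 1, 2, 1, 0), (1, 2, 0, 1, 1)),  # I9
    ((0, 1, 2, 0, 3), (1, 2, 0, 2, 0)),  # I10
    ((0, 2, 3, 0, 2), (1, 2, 0, 2, 0)),  # I11
    ((1, 3, 0, 1, 1), (2, 0, 1, 1, 1)),  # I12
    ((1, 3, 1, 0, 1), (2, 0, 1, 1, 1)),  # I13
    ((0, 1, 3, 0, 3), (1, 2, 0, 2, 0)),  # I14
]

for state, control in TABLE_2:
    connected = is_connected(control)
    # Elements left in the failure state should carry no load (level 0).
    failed_idle = all(lvl == 0 for x, lvl in zip(state, control) if x == FAILED)
    print(state, control, connected, failed_idle)
```

Under these assumptions, every control vector in Table 2 keeps the system connected, and every element left in state 3 is assigned performance level 0, with neighboring elements raised to level 2 to bridge the gap.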

Table 3. Instances of the comparison experiment.

Group	Instance	Original State ($x$)	Replacement Action ($δ$)	Post-Replacement State ($x^{m}$)	Performance Control ($u$)	Value Function $v(x)$
G3	I15	(0,0,0,1,2)	(0,0,0,0,0)	(0,0,0,1,2)	(1,1,1,2,0)	4097.94
		(0,0,0,1,2)	(0,0,0,1,1)	(0,0,0,0,0)	(1,1,1,1,1)	4415.34
	I16	(1,1,1,1,2)	(0,0,0,0,0)	(1,1,1,1,2)	(1,1,1,2,0)	4217.31
		(1,1,1,1,2)	(1,0,0,0,1)	(0,1,1,1,0)	(1,1,1,1,1)	4536.14
	I17	(1,0,2,0,2)	(0,0,0,0,0)	(1,0,2,0,2)	(1,2,0,2,0)	4161.83
		(1,0,2,0,2)	(0,0,1,0,1)	(1,0,0,0,0)	(1,1,1,1,1)	4463.61
	I18	(0,0,1,1,2)	(0,0,0,0,0)	(0,0,1,1,2)	(1,1,1,2,0)	4133.36
		(0,0,1,1,2)	(0,0,0,1,1)	(0,0,1,0,0)	(1,1,1,1,1)	4461.68
	I19	(0,0,1,2,0)	(0,0,0,0,0)	(0,0,1,2,0)	(1,1,2,0,1)	4097.69
		(0,0,1,2,0)	(0,0,1,1,0)	(0,0,0,0,0)	(1,1,1,1,1)	4415.34

Table 4. Instances of the sensitivity analysis on maintenance capacity.

Group	Instance	Original State ($x$)	Replacement Action ($δ$)	Post-Replacement State ($x^{m}$)	Performance Control ($u$)	Value Function $v(x)$
G4	I20	(2,3,2,3,1)	(1,0,1,0,0)	(0,3,0,3,1)	(2,0,2,0,1)	4544.96
		(2,3,2,3,1)	(1,1,1,1,1)	(0,0,0,0,0)	(1,1,1,1,1)	3539.64
	I21	(2,2,2,3,2)	(1,0,1,0,0)	(0,2,0,3,2)	(2,0,2,0,1)	4498.97
		(2,2,2,3,2)	(1,1,1,1,1)	(0,0,0,0,0)	(1,1,1,1,1)	3409.64
	I22	(2,2,3,2,3)	(1,1,0,0,0)	(0,0,3,2,3)	(1,2,0,2,0)	4624.48
		(2,2,3,2,3)	(1,1,1,1,1)	(0,0,0,0,0)	(1,1,1,1,1)	3539.64
G5	I23	(3,1,2,1,2)	(1,0,0,0,1)	(0,1,2,1,0)	(1,2,0,1,1)	4403.44
		(3,1,2,1,2)	(1,1,1,1,1)	(0,0,0,0,0)	(1,1,1,1,1)	3409.64
	I24	(2,1,2,2,3)	(1,0,0,1,0)	(0,1,2,0,3)	(1,2,0,2,0)	4430.72
		(2,1,2,2,3)	(1,1,1,1,1)	(0,0,0,0,0)	(1,1,1,1,1)	3409.64
	I25	(2,2,3,2,2)	(1,0,0,1,0)	(0,2,3,0,2)	(1,2,0,2,0)	4500.64
		(2,2,3,2,2)	(1,1,1,1,1)	(0,0,0,0,0)	(1,1,1,1,1)	3409.64
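
The full-replacement rows in Table 4 take only two values, 3539.64 and 3409.64, which differ by exactly 130. A minimal sketch of the replacement-cost arithmetic reproduces this gap, assuming that replacing an element in the failure state costs $c_{cm}$ and replacing any other element costs $c_{pm}$, with setup and inspection costs excluded because they are common to both cases. The function name `replacement_cost` is hypothetical.

```python
C_CM, C_PM, FAILED = 150, 20, 3  # values from Table 1


def replacement_cost(state, action):
    """Sum the per-element replacement costs for the chosen action.

    Assumption: a replaced element in the failure state incurs the
    corrective cost, any other replaced element the preventive cost.
    Setup and inspection costs are deliberately left out.
    """
    return sum(
        C_CM if x == FAILED else C_PM
        for x, d in zip(state, action)
        if d == 1
    )


replace_all = (1, 1, 1, 1, 1)
two_failed = replacement_cost((2, 3, 2, 3, 1), replace_all)  # I20: 2 * 150 + 3 * 20
one_failed = replacement_cost((2, 2, 2, 3, 2), replace_all)  # I21: 1 * 150 + 4 * 20
print(two_failed, one_failed, two_failed - one_failed)  # 360 230 130
```

The gap of 130 equals $c_{cm} - c_{pm}$, which is consistent with all full-replacement rows sharing the same post-replacement state $(0,0,0,0,0)$ and therefore the same continuation value, so that they differ only in the number of failed elements replaced.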

Table 5. Instances of the sensitivity analysis on the maintenance setup cost.

Group	Instance	Original State ($x$)	Replacement Action ($δ$)	Post-Replacement State ($x^{m}$)	Performance Control ($u$)	Value Function $v(x)$
G6	I26	(0,2,1,1,1)	(0,0,0,0,0)	(0,2,1,1,1)	(2,0,1,1,1)	4161.89
		(0,2,1,1,1)	(0,1,0,0,0)	(0,0,1,1,1)	(1,1,1,1,1)	2234.32
	I27	(1,1,0,2,1)	(0,0,0,0,0)	(1,1,0,2,1)	(1,1,2,0,1)	4161.69
		(1,1,0,2,1)	(0,0,0,1,0)	(1,1,0,0,1)	(1,1,1,1,1)	2234.41
G7	I28	(0,0,1,1,2)	(0,0,0,0,0)	(0,0,1,1,2)	(1,1,1,2,0)	4133.36
		(0,0,1,1,2)	(0,0,0,0,1)	(0,0,1,1,0)	(1,1,1,1,1)	2215.47
	I29	(0,0,1,2,1)	(0,0,0,0,0)	(0,0,1,2,1)	(1,1,2,0,1)	4133.46
		(0,0,1,2,1)	(0,0,0,1,0)	(0,0,1,0,1)	(1,1,1,1,1)	2215.26
G8	I30	(0,0,1,3,1)	(0,0,0,0,0)	(0,0,1,3,1)	(1,1,2,0,1)	4255.37
		(0,0,1,3,1)	(0,0,0,1,0)	(0,0,1,0,1)	(1,1,1,1,1)	2345.26
	I31	(1,3,0,0,0)	(0,0,0,0,0)	(1,3,0,0,0)	(2,0,1,1,1)	4219.07
		(1,3,0,0,0)	(1,1,0,0,0)	(0,0,0,0,0)	(1,1,1,1,1)	2324.77

Table 6. Instances of the sensitivity analysis on the element failure cost.

Group	Instance	Original State ($x$)	Replacement Action ($δ$)	Post-Replacement State ($x^{m}$)	Performance Control ($u$)	Value Function $v(x)$
G9	I32	(2,3,2,3,1)	(1,0,1,0,0)	(0,3,0,3,1)	(2,0,2,0,1)	4544.96
		(2,3,2,3,1)	(1,0,1,0,0)	(0,3,0,3,1)	(2,0,2,0,1)	3941.57
	I33	(2,2,3,2,3)	(1,1,0,0,0)	(0,0,3,2,3)	(1,2,0,2,0)	4624.48
		(2,2,3,2,3)	(1,1,0,0,0)	(0,0,3,2,3)	(1,2,0,2,0)	3977.74
G10	I34	(0,0,1,2,0)	(0,0,0,0,0)	(0,0,1,2,0)	(1,1,2,0,1)	4097.69
		(0,0,1,2,0)	(0,0,0,0,0)	(0,0,1,2,0)	(1,1,2,0,1)	3623.82
	I35	(2,1,2,3,2)	(1,0,0,1,0)	(0,1,2,0,2)	(1,2,0,2,0)	4438.67
		(2,1,2,3,2)	(1,0,0,1,0)	(0,1,2,0,2)	(1,2,0,2,0)	3889.36
G11	I36	(1,3,1,0,1)	(0,0,0,0,0)	(1,3,1,0,1)	(2,0,1,1,1)	4293.01
		(1,3,1,0,1)	(0,0,0,0,0)	(1,3,1,0,1)	(2,0,1,1,1)	3744.57
	I37	(3,1,3,2,3)	(1,0,0,1,0)	(0,1,3,0,3)	(1,2,0,2,0)	4682.21
		(3,1,3,2,3)	(1,0,0,1,0)	(0,1,3,0,3)	(1,2,0,2,0)	4001.44

© 2023 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/).
