Reliability Evaluation for Clustered WSNs under Malware Propagation

We consider a clustered wireless sensor network (WSN) under epidemic-malware propagation conditions and solve the problem of how to evaluate its reliability so as to ensure efficient, continuous, and dependable transmission of sensed data from sensor nodes to the sink. Facing the contradiction between malware intention and continuous-time Markov chain (CTMC) randomness, we introduce a strategic game that can predict malware infection in order to model a successful infection as a CTMC state transition. Next, we devise a novel measure to compute the Mean Time to Failure (MTTF) of a sensor node, which represents the reliability of a sensor node continuously performing tasks such as sensing, transmitting, and fusing data. Since clustered WSNs can be regarded as parallel-serial-parallel systems, the reliability of a clustered WSN can be evaluated via classical reliability theory. Numerical results show the influence of parameters such as the true positive rate and the false positive rate on a sensor node’s MTTF. Furthermore, we validate the method of reliability evaluation for a clustered WSN according to the number of sensor nodes in a cluster, the number of clusters in a route, and the number of routes in the WSN.


Introduction
Wireless Sensor Networks (WSNs) play an important role in daily life, as numerous modern information systems rely on WSNs, which consist of many sensor nodes with limited computation, storage, and communication resources. WSN applications include environmental, highway, and patient health monitoring as well as other commercial uses [1]. To realize these applications, the research community has focused on ensuring the reliability of WSNs. By definition, reliability reflects the ability of a system or component to perform its required functions under stated conditions for a specified period of time. Due to the nature of data collection in WSNs, this ability has been a major challenge in applying WSNs for successful monitoring. All sensor nodes need to send their sensed data towards the sink. Hence, packet loss due to transmission errors, packet collisions, interference, node failures, and malicious attacks is common [2]. Therefore, reliability evaluation for WSNs is vital in order to guarantee the delivery of sensed data from sensor nodes to the sink.
Our main contributions are as follows: (1) we relate the intent of malware infection to the CTMC's randomness by introducing a strategic game that can predict malware's infection behavior. In this manner, state transitions of a sensor node that arise from malware actions can be modeled by the CTMC; and (2) we propose a novel measure to compute the Mean Time to Failure (MTTF) of a sensor node, which represents the reliability of a sensor node continuously performing tasks such as sensing, transmitting, and fusing data. Thus, we can deduce the reliability of a cluster, a route, and a clustered WSN, respectively, from the perspective of a parallel-serial-parallel system. This method of reliability evaluation for clustered WSNs under epidemic-malware propagation can help establish theoretical foundations that guide rules for applying reliability techniques. Consequently, WSNs that guarantee reliable delivery of sensor nodes' sensed data may be realized.
The rest of this article is organized as follows: we first review related work and highlight the salient features of our approach in Section 2. We describe infections as state transitions from the view of a CTMC in Section 3. We obtain the infection probability by introducing a strategic malware-infection game in Section 4. We propose measures of reliability evaluation for clustered WSNs under the scenario of epidemic-malware propagation in Section 5. We validate the proposed measures' efficacy in Section 6. Finally, we conclude the article in Section 7.

Related Work
Based on classical epidemic models, many extended studies have been performed to describe the characteristics of malware propagation in WSNs. In a good survey, Yu et al. [13] presented current works on modeling malware propagation. Generally, sensor nodes periodically enter sleep mode to save energy. Typical models reflecting nodes sleeping during malware propagation include EiSIRS [14], a modified SI model [15], a modified SIS model [16], and Shen's model [17]. Moreover, sensor nodes "die" due to energy exhaustion or intentional destruction by malware. Thus, a dead state was introduced in iSIRS [18] based on the SIR model and Shen's model [17]. Furthermore, a reaction-diffusion-theoretic model [19], a pulse-differential-equation-based SIR model [20], and a susceptible-exposed-infected-recovered-susceptible model [21] were proposed in order to foresee the spatial distribution and the temporal dynamics of malware propagation. In addition, Yu et al. [22] proposed a two-layer malware propagation model that better represents malware propagation in large-scale networks compared with existing single-layer epidemic models. Keshri and Mishra [23] presented a susceptible-exposed-infectious-recovered model with two time delays for characterizing the transmission dynamics of malware propagation. Zhu and Zhao [24] explored a SIR-based nonlinear malware propagation model in WSNs. Wang et al. [25] presented a survey on modeling malware propagation in networks including WSNs. Other typical models [26][27][28][29] address the problem of malware propagation in multi-hop networks, which can help illustrate malware propagation in WSNs.
Several authors have considered decision-making dilemmas arising during malware propagation. Khouzani et al. [30] established a zero-sum dynamic game between the network system and the malware, given that malware can dynamically alter infection parameters based on the network system's dynamics. Jin et al. [31] used an evolutionary game to construct a malware propagation model under bounded rationality, where the game is to predict the trend of malware's evolutionary infection. Spyridopoulos et al. [32] employed a complete information game to obtain the defender's optimal strategy that minimizes the security cost as well as the malware effect. In addition, Trajanovski et al. [33] found decentralized optimal protection strategies for the network system by proposing a game-theoretic framework and seeking its Nash equilibria and the Price of Anarchy.
Several methods have been proposed using various techniques to cope with the challenge of evaluating WSNs' reliability. In a pioneering work [34], the authors employed a probabilistic graph to represent WSNs given a failure probability estimation of sensor nodes. Kar et al. [35] modeled WSNs' energy reliability assuming Markovian sensor discharge/recharge periods. Distefano [36] used dynamic reliability block diagrams to represent static structural interactions between sensor nodes, where sleep/wake-up standby policies and interference are considered dynamically. Based on [36], he further gave a reliability evaluation model integrating Petri nets [37]. Silva et al. [38] proposed an evaluation methodology supporting arbitrary failure conditions based on automatically generated fault trees. Niyato et al. [39] proposed reliability analysis of wireless communications systems in the smart grid, which is also suitable for WSNs. Kamal et al. [40] developed a novel framework called Packet-Level Attestation for sensor data reliability evaluation using the spatial relationship among data sensed by nearby sensor nodes. In [41], Dâmaso et al. proposed a reliability evaluation model based on routing algorithms in WSNs and sensor nodes' various battery levels. According to the MAC protocols adopted in WSNs, Wang et al. [42] evaluated a sensor node's reliability under three typical working scenarios: sensor nodes always in active mode, alternating between sleep and active modes on average, and alternating between these modes based on a certain distribution. Zonouz et al. [43] evaluated the reliability of energy-harvesting sensor nodes and battery-powered sensor nodes and developed the corresponding wireless-link-reliability models. Cai et al. [44] characterized event-driven WSNs according to limited node battery energy and shadowing under channel fading, obtaining reliable data flows in WSNs via wireless link reliability and node energy availability. Wang et al. [45] analyzed a body sensor node's reliability subject to probabilistic competition between propagation effects and probabilistic failure isolation. Yan et al. [46] proposed a symbolic ordered-binary-decision-diagram-multicast method to evaluate the reliability of multicast WSNs. Zhu et al. [47] proposed mission-oriented and transmission-paths-based models for evaluating WSNs' transmission reliability. Other measures closely related to reliability include dependability and survivability, such as the stochastic-activity-network-based dependability measure [48], the epidemic-theory-based survivability measure [49,50], survivability analysis using probabilistic model checking [51], and natural-tenacity-based survivability evaluation for mobile WSNs [52].
Unlike this body of work, to the best of our knowledge this work is the first that concentrates on reliability evaluation for clustered WSNs under epidemic-malware propagation conditions. While our epidemic model is similar to [14,17], we further model all state transitions of a sensor node as a CTMC by integrating the malware's infection probability predicted by our strategic malware-infection game into the transition probability. We apply the approach proposed in [53] to compute the MTTF from a Markov process; however, we further find the equation to compute the reliability of a sensor node under epidemic-malware propagation, which is a novel measure. Considering the topology of a typical clustered WSN, we can thus deduce the reliability of a cluster, a route, and a clustered WSN based on reliability theory. In summary, we propose a unified framework for malware-infection and reliability evaluation that integrates both security and reliability properties of a clustered WSN during the evaluation process.

Modeling State Transitions of a Sensor Node as a CTMC
The various epidemic-malware propagation models mentioned above are actually state transition models. These states are mutually exclusive: a sensor node is in exactly one state at any time. During its lifecycle, a sensor node moves among different states. Figure 1 illustrates a CTMC indicating state transitions of a sensor node, where $p_{ij}$, $i, j \in \{S, \tilde{S}, I, \tilde{I}, R, \tilde{R}, D\}$, governs the transition from state $i$ to state $j$, and $S$, $\tilde{S}$, $I$, $\tilde{I}$, $R$, $\tilde{R}$, and $D$ denote states Susceptible, Susceptible while sleeping, Infected, Infected while sleeping, Recovered, Recovered while sleeping, and Dead, respectively. Each sensor node's characteristics determine its state. State $S$ denotes a sensor node that works normally and is not infected by malware, but is susceptible to malware. $\tilde{S}$ denotes a sleeping sensor node that malware cannot infect although the node is susceptible. $I$ denotes a sensor node that has been infected by malware and, being under the malware's control, may propagate malware to the neighboring nodes with which it communicates. $\tilde{I}$ denotes an infected sensor node that is sleeping; thus, it cannot propagate malware. $R$ denotes a "recovered" node that can be "immunized" to current malware, but not to future malware. $\tilde{R}$ denotes a recovered node that is sleeping. $D$ denotes a node that is unusable because malware exhausted its energy, deliberately destroyed it, or both.
In practice, behaviors of both sensor nodes and malware trigger each state transition. For each sensor node, installing security patches is a normal method either to protect susceptible nodes from known malware by fixing bugs or to remedy an infected node and immunize it to known malware. Thus, this action results in the state transitions $S \to R$ and $I \to R$. In general, each node is scheduled to sleep, which saves energy, or to awaken, which leads to the state transitions $S \to \tilde{S}$ or $\tilde{S} \to S$. The state transitions $I \to \tilde{I}$ or $\tilde{I} \to I$ as well as $R \to \tilde{R}$ or $\tilde{R} \to R$ follow likewise. When confronting unknown malware, a sensor node usually lacks immunity, and the state transition $R \to S$ follows. In addition, the state transitions $S \to D$, $\tilde{S} \to D$, $I \to D$, $\tilde{I} \to D$, $R \to D$, and $\tilde{R} \to D$ take place when malware exhausts a sensor node's energy. On the other hand, contamination by malware leads to the state transition $S \to I$. Moreover, malware can deliberately destroy an infectious sensor node besides the case of death from energy exhaustion; hence, the state transition $I \to D$ occurs.
However, the state transition $S \to I$ cannot be formulated as a stochastic process because the infection behavior resulting in this transition is deliberate. Even though the time to execute action Infect is stochastically distributed, the decision to execute the actual infection is not. Since malware decides whether to infect a susceptible sensor node, there is an infection probability with which malware decides to execute infection. To formalize the malware's decision, we let $\rho_I$ be the probability that the malware will choose action Infect to propagate itself and $\lambda$ be the probability of a successful infection. Thus, the infection rate (i.e., the state transition rate) from state $S$ to state $I$ becomes:
$$p_{SI} = \rho_I \lambda \qquad (1)$$
By introducing the infection probability $\rho_I$, we can model the consequence of a successful infection as one deliberate state transition of the CTMC, which describes the dynamics of a sensor node's behavior. In this manner, our modeling approach concentrates on the higher-level effects of infection on a sensor node rather than the lower-level specific infection procedures.

A Strategic Malware-Infection Game to Obtain the Infection Probability
We employ a strategic game to predict the malware's infection behavior, namely to obtain the probability $\rho_I$. We adopt game-theoretic analysis because the following dilemma arises: the malware attempts to infect as many sensor nodes as possible without detection, whereas the defender attempts to enhance the network robustness by detecting more malware. Facing this dilemma, we introduce players malware and system to play the strategic game. Even though there are various kinds of malware whose goals may be to eavesdrop on private sensed data or to disable communication among sensor nodes, it is sufficient to regard all malware as player malware due to malware's similar motivations and skills. Player system, the opponent of malware, actually corresponds to the IDSes residing in WSNs.
In practice, iterations of the strategic game can be depicted as follows. There are discrete periods of time in which player malware launches infection. Player system intends to prevent infection in any of these periods. In each period, each player has two actions: player malware can either infect or not infect while player system can either defend or not defend. But each player can choose only one action with either pure or mixed strategies. If player malware takes no action and player system does not defend, then the game enters the next stage. Next, we formally define our strategic game and we explore the game entirely from malware's view, since the target of our strategic game is to predict the infection intention of player malware and not to obtain the optimal defense strategies for player system.

Definition 1.
The Strategic Malware-Infection Game (SMIG) is formulated by a 4-tuple consisting of the two players (malware and system), their action sets $A_M$ = {Infect, Non-infect} and $A_S$ = {Defend, Non-defend}, their strategies, and their payoffs. Let $\rho_I$ and $\rho_\phi$ be the probabilities that player malware adopts actions Infect and Non-infect, respectively. Let $\delta_D$ and $\delta_\phi$ be the probabilities that player system adopts actions Defend and Non-defend, respectively. Accordingly, infection strategy $\rho$ and defense strategy $\delta$ are mixed strategies $(\rho_I, \rho_\phi)$ and $(\delta_D, \delta_\phi)$, which represent the probability distributions over action sets $A_M$ and $A_S$, respectively. Both of them satisfy $\rho_I + \rho_\phi = 1$ and $\delta_D + \delta_\phi = 1$. The infection probability $\rho_I$ describes the degree of infection for a sensor node (equivalently, the aggressiveness of player malware targeting a sensor node). The larger $\rho_I$ is, the greater the probability of action Infect and, hence, the larger the corresponding infection rate for a sensor node.
Now we consider the payoff matrix to explore malware's motivation. For simplicity, we denote the worth of a sensor node by $\omega$, where $\omega > 0$. Here, $\omega$ is equivalent to a degree of damage such as the loss of sensed data, loss due to compromise, and so on. If player malware infects successfully, it will obtain payoff $\omega$ and its opponent will obtain payoff $-\omega$. On the contrary, if player system succeeds in defense, its payoff is $\omega$ because it has protected a sensor node worth $\omega$, and player malware will be penalized by $\omega$. However, no IDS can detect all current and future malware: all IDSes have true positive rates and false positive rates. Next, we use these two rates to define the payoffs of malware and system, and we let $\alpha$ and $\beta$ be the true positive rate and the false positive rate of the WSN IDS, respectively. We also let $c_I$ and $c_D$ be the cost of player malware infecting a susceptible sensor node and the cost of player system detecting the malware's infection, respectively. There are four possible action profiles in the payoff matrix, since each of malware and system has two possible actions.
For the action profile (Infect, Defend), player malware obtains gain $(1-\alpha)\omega$ from detection failure, loss $\alpha\omega$ from being detected successfully, and loss $c_I$ from adopting action Infect. Thus, the payoff of player malware, $u^I_{ID}$, is:
$$u^I_{ID} = (1-\alpha)\omega - \alpha\omega - c_I = (1-2\alpha)\omega - c_I \qquad (2)$$
On the other hand, player system obtains gain $\alpha\omega$ and loses $(1-\alpha)\omega$ from detection failure, $\beta\omega$ from false positive detection, and $c_D$ from detecting malware's infection. Thus, the payoff of player system, $u^S_{ID}$, is:
$$u^S_{ID} = \alpha\omega - (1-\alpha)\omega - \beta\omega - c_D = (2\alpha - 1 - \beta)\omega - c_D \qquad (3)$$
For the action profile (Infect, Non-defend), player malware obtains gain $\lambda\omega$ from successful infection and loses $c_I$ from adopting action Infect. Thus, the payoff of player malware, $u^I_{I\phi}$, is:
$$u^I_{I\phi} = \lambda\omega - c_I \qquad (4)$$
whereas the payoff of player system, $u^S_{I\phi}$, is:
$$u^S_{I\phi} = -\lambda\omega \qquad (5)$$
For the action profile (Non-infect, Defend), the payoff of player malware, $u^I_{\phi D}$, is:
$$u^I_{\phi D} = 0 \qquad (6)$$
whereas the payoff of player system, $u^S_{\phi D}$, is:
$$u^S_{\phi D} = -\beta\omega - c_D \qquad (7)$$
Finally, for the action profile (Non-infect, Non-defend), since neither player malware nor player system can obtain any gain or produce any loss, the payoff of player malware, $u^I_{\phi\phi}$, is:
$$u^I_{\phi\phi} = 0 \qquad (8)$$
and the payoff of player system, $u^S_{\phi\phi}$, is:
$$u^S_{\phi\phi} = 0 \qquad (9)$$
The objective of player malware is to maximize its expected infection utility, whereas the objective of player system is to minimize its expected defense utility. This objective can be achieved by solving for the mixed-strategy Nash Equilibrium (NE) of the strategic game.
Theorem 1. In the SMIG, the optimal probability of player malware choosing action Infect is:
$$\rho_I^* = \frac{\beta\omega + c_D}{(2\alpha + \lambda - 1)\omega} \qquad (10)$$
Proof: Under player malware's mixed strategy, player system's expected payoffs for choosing actions Defend and Non-defend are:
$$E_S(\text{Defend}) = \rho_I\left[(2\alpha - 1 - \beta)\omega - c_D\right] + (1-\rho_I)(-\beta\omega - c_D) = \rho_I(2\alpha - 1)\omega - \beta\omega - c_D \qquad (11)$$
and:
$$E_S(\text{Non-defend}) = \rho_I(-\lambda\omega) + (1-\rho_I)\cdot 0 = -\rho_I\lambda\omega \qquad (12)$$
respectively. From the indifference between actions Defend and Non-defend under the optimal mixed strategy of player system, we obtain:
$$\rho_I(2\alpha - 1)\omega - \beta\omega - c_D = -\rho_I\lambda\omega \qquad (13)$$
Therefore, the optimal probability of player malware choosing action Infect is:
$$\rho_I^* = \frac{\beta\omega + c_D}{(2\alpha + \lambda - 1)\omega} \qquad (14)$$
Obtaining the optimal infection probability $\rho_I^*$ means that we acquire an indication of the expected infection behavior of player malware for WSNs under epidemic-malware propagation. In other words, malware will choose action Infect with probability $\rho_I^*$. When following $\rho_I^*$, player malware has no reason to adjust its strategy, as it has maximized its expected utility from the infection regardless of the success of its actions.
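The indifference argument in the proof can be checked numerically. The following is a minimal Python sketch, not the authors' MATLAB code. It assumes the closed form $\rho_I^* = (\beta\omega + c_D) / ((2\alpha + \lambda - 1)\omega)$, which follows from the payoffs described above together with a system payoff of $-\beta\omega - c_D$ for (Non-infect, Defend); that last payoff is not stated in words in the text and is inferred here because it reproduces the infection probabilities reported in the numerical section. The function names are illustrative.

```python
def optimal_infection_probability(alpha, beta, omega, c_d, lam):
    """Mixed-strategy NE probability that malware plays Infect (assumed closed form)."""
    denom = (2.0 * alpha + lam - 1.0) * omega
    if denom <= 0:
        raise ValueError("requires 2*alpha + lam > 1 and omega > 0")
    rho = (beta * omega + c_d) / denom
    if not 0.0 <= rho <= 1.0:
        raise ValueError("no interior mixed-strategy equilibrium for these parameters")
    return rho


def system_expected_payoffs(rho, alpha, beta, omega, c_d, lam):
    """Expected payoffs of player system for Defend and Non-defend, given rho."""
    u_infect_defend = (2.0 * alpha - 1.0 - beta) * omega - c_d
    # Assumption: a needless defense costs the false-positive loss plus c_D.
    u_noninfect_defend = -beta * omega - c_d
    defend = rho * u_infect_defend + (1.0 - rho) * u_noninfect_defend
    non_defend = -rho * lam * omega
    return defend, non_defend


if __name__ == "__main__":
    # Parameters used in the numerical section: omega=50, c_D=10, lambda=0.5.
    for beta in (0.01, 0.05, 0.1):
        rho = optimal_infection_probability(0.88, beta, 50.0, 10.0, 0.5)
        print(f"alpha=0.88 beta={beta}: rho_I* = {rho:.4f}")
```

At the returned probability, `system_expected_payoffs` yields equal values for Defend and Non-defend, which is exactly the indifference condition used in the proof. Note that $c_I$ does not appear: the malware's equilibrium mix is pinned down by the system's payoffs only.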

Evaluating the Reliability of a Sensor Node
The reliability of WSNs represents the probability that sensor nodes continue to perform tasks such as data sensing, transmission, and fusion over a particular period of time under stated conditions. In general, MTTF and MTBF (Mean Time between Failures) are typical ways to evaluate the reliability of pieces of hardware or other technology. Here, MTTF refers to the length of time that a device is expected to last in operation, whereas MTBF is the average elapsed time between a device's failures in operation. The difference between the two terms is that MTTF is used for non-repairable devices, whereas MTBF is used for devices that can be repaired and returned to operation. In this work, we concentrate on WSNs where sensor nodes can hardly be repaired upon failure, and we therefore use MTTF for evaluating the reliability of a sensor node.
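We denote by $E$ the discrete state space, where:
$$E = \{S, \tilde{S}, I, \tilde{I}, R, \tilde{R}, D\} \qquad (15)$$
as illustrated in Figure 1. Let:
$$Y(t) = \left(Y_S(t), Y_{\tilde{S}}(t), Y_I(t), Y_{\tilde{I}}(t), Y_R(t), Y_{\tilde{R}}(t), Y_D(t)\right) \qquad (16)$$
be the state probability vector, where $Y_x(t)$ denotes the probability that a sensor node is in state $x$, $x \in E$, at time $t$. Let $P$ be the $7 \times 7$ state transition matrix consisting of elements $p_{ij}$, $i, j \in E$. The state equation of a sensor node is given as Equation (17). We find the steady-state probability vector that is independent of $Y(0)$ (i.e., the initial state):
$$Y = \left(Y_S, Y_{\tilde{S}}, Y_I, Y_{\tilde{I}}, Y_R, Y_{\tilde{R}}, Y_D\right) \qquad (18)$$
by solving a system of seven equations, where six of the seven equations come from Equation (19) and the seventh equation is:
$$\sum_{x \in E} Y_x = 1 \qquad (20)$$
Next, we compute a sensor node's MTTF from the steady-state probability vector. The discrete state space $E$ can be split into two disjoint sets $E_{Use}$ and $E_{Disuse}$, where:
$$E_{Use} = \{S, R\} \qquad (21)$$
and:
$$E_{Disuse} = \{\tilde{S}, I, \tilde{I}, \tilde{R}, D\} \qquad (22)$$
denote the sets of usable and unusable states, respectively. Correspondingly, the state transition matrix $P$ can be rewritten in block form according to this split (Equations (23)-(26)). Likewise, we split the steady-state probability vector into two disjoint parts $Y_{Use}$ and $Y_{Disuse}$, where:
$$Y_{Use} = (Y_S, Y_R) \qquad (27)$$
and:
$$Y_{Disuse} = \left(Y_{\tilde{S}}, Y_I, Y_{\tilde{I}}, Y_{\tilde{R}}, Y_D\right) \qquad (28)$$
According to the method provided in [53] to compute the MTTF from a Markov process, we find the MTTF of a sensor node, $\eta$, as Equation (29), where $Y_{Use}(0)$ denotes the initial usable state probability vector (i.e., with $t = 0$), computed as Equation (30), and $\mathbf{I}$ denotes a column vector of two ones:
$$\mathbf{I} = (1, 1)^T \qquad (31)$$
Let $Reliability_{node}(t)$ be the reliability of a sensor node at time $t$. We assume that all sensor nodes, due to their similarity, have the same MTTF. From reliability theory, we find that:
$$Reliability_{node}(t) = e^{-t/\eta} \qquad (32)$$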
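To make the idea of computing an MTTF from a Markov process concrete, here is a minimal Python sketch. It is not the formula of [53], whose exact form is not reproduced in this text. It uses the standard absorbing-chain result for a CTMC, namely that the mean time to leave a set of usable states equals $y_0 (-A)^{-1} \mathbf{1}$, where $A$ is the generator restricted to the usable states and $y_0$ is the initial distribution over them. The two usable states are taken to be $S$ and $R$, and all rates in the example are hypothetical.

```python
def mttf_from_usable_block(a, y0):
    """Mean time until the chain first leaves the two usable states.

    a  : ((a_SS, a_SR), (a_RS, a_RR)), generator rates restricted to the
         usable states; each diagonal entry is minus the total outflow
         rate of that state (including flows to unusable states).
    y0 : (P[start in S], P[start in R]); should sum to 1.
    """
    (a_ss, a_sr), (a_rs, a_rr) = a
    # Work with M = -A and invert the 2x2 matrix in closed form.
    m = ((-a_ss, -a_sr), (-a_rs, -a_rr))
    det = m[0][0] * m[1][1] - m[0][1] * m[1][0]
    if det <= 0:
        raise ValueError("usable states must be able to reach an unusable state")
    inv = ((m[1][1] / det, -m[0][1] / det),
           (-m[1][0] / det, m[0][0] / det))
    # Row sums of the inverse are the expected times to failure from S and R.
    time_from_s = inv[0][0] + inv[0][1]
    time_from_r = inv[1][0] + inv[1][1]
    return y0[0] * time_from_s + y0[1] * time_from_r


if __name__ == "__main__":
    # Hypothetical rates: S leaves at total rate 3 (rate 1 of it goes to R),
    # R leaves at total rate 2 (rate 1 of it goes back to S).
    rates = ((-3.0, 1.0), (1.0, -2.0))
    print(mttf_from_usable_block(rates, (0.5, 0.5)))
```

In this sketch an increase in the infection rate $p_{SI}$ would enlarge the outflow from $S$ and therefore shorten the MTTF, which is the qualitative link between the game's $\rho_I^*$ and node reliability that the section relies on.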

Evaluating Reliability of a Clustered WSN
We aim to perform reliability evaluations for clustered WSNs due to their popularity. Figure 2 illustrates the topology of a clustered WSN. Coordinating cluster heads (CHs) control the topology where each CH guides a different cluster of sensor nodes. Such an architecture leads to a two-tier hierarchy where the upper tier comprises CHs and the lower tier comprises sensor nodes. Sensor nodes in specific regions transmit their sensed data to the responsible CH that manages the nodes. The responsible CH collects the data, which are transmitted to the single base station via other CHs. In this way, we can relate each cluster to a parallel system and each set of clusters to a serial system. Therefore, any route from a sensor node to the base station can be naturally regarded as a serial-parallel path. Since there are various routes through which sensed data can be transferred, a clustered WSN can be mapped correspondingly to a parallel-serial-parallel system where each sensor node fails independently. Each cluster operates if at least one of its sensor nodes works normally. However, for any route, all of its clusters must work normally for proper operation. As a result, we can evaluate the reliability of a clustered WSN based on the MTTF of a sensor node from the perspective of classical reliability theory.

Since each cluster is a parallel system, the reliability of a cluster at time $t$, $Reliability_{cluster}(t)$, can be computed as:
$$Reliability_{cluster}(t) = 1 - \prod_{node \in cluster} \left(1 - Reliability_{node}(t)\right) \qquad (33)$$
Moreover, any route composed of clusters is a serial system, and thus the reliability of a route at time $t$, $Reliability_{route}(t)$, can be computed as:
$$Reliability_{route}(t) = \prod_{cluster \in route} Reliability_{cluster}(t) \qquad (34)$$
Finally, a clustered WSN composed of available routes is a parallel system. Therefore, the reliability of a clustered WSN at time $t$, $Reliability(t)$, is:
$$Reliability(t) = 1 - \prod_{route \in WSN} \left(1 - Reliability_{route}(t)\right) \qquad (35)$$
So far, we have developed a method of reliability evaluation for clustered WSNs under epidemic-malware propagation conditions. In practice, we suggest using the proposed method in off-line mode, since performing on-line reliability evaluation for clustered WSNs is very difficult. First, we define our strategic game mathematically and solve it analytically. We integrate the results of the game with the transition probability upon infection of a sensor node, from which we compute the MTTF of a sensor node. As soon as the topology of a clustered WSN is determined based on the actual requirements, we can find the number of sensor nodes in a cluster, the number of clusters in a route, and the number of routes in a clustered WSN. Based on these numbers, we can compute the reliability of a clustered WSN from Equation (35). This procedure is common among game-theoretic approaches and is easy to realize.
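The parallel-serial-parallel composition described in this section can be sketched in a few lines of Python. This is an illustrative sketch rather than the authors' implementation. It assumes that node lifetimes are exponential, so that a node with MTTF $\eta$ has reliability $e^{-t/\eta}$ at time $t$, that all nodes share the same MTTF and fail independently (as stated above), and that a topology is described simply by a list of routes, each given as a list of cluster sizes. The function names are hypothetical.

```python
import math


def parallel(reliabilities):
    """A parallel block works if at least one component works."""
    all_fail = 1.0
    for r in reliabilities:
        all_fail *= (1.0 - r)
    return 1.0 - all_fail


def series(reliabilities):
    """A serial block works only if every component works."""
    all_work = 1.0
    for r in reliabilities:
        all_work *= r
    return all_work


def node_reliability(t, mttf):
    """Assumed exponential lifetime: reliability exp(-t / MTTF)."""
    if mttf <= 0:
        raise ValueError("MTTF must be positive")
    return math.exp(-t / mttf)


def wsn_reliability(t, mttf, routes):
    """routes: list of routes; each route is a list of cluster sizes."""
    r_node = node_reliability(t, mttf)
    route_rels = []
    for cluster_sizes in routes:
        # Each cluster is a parallel system of identical nodes.
        cluster_rels = [parallel([r_node] * n) for n in cluster_sizes]
        # Each route is a serial system of its clusters.
        route_rels.append(series(cluster_rels))
    # The WSN is a parallel system of its routes.
    return parallel(route_rels)


if __name__ == "__main__":
    # Hypothetical topology: two routes, with 3+3 and 2+4+2 nodes per cluster.
    print(wsn_reliability(t=2.0, mttf=5.0, routes=[[3, 3], [2, 4, 2]]))
```

Adding nodes to a cluster or routes to the network raises the result, whereas adding clusters to a route lowers it, which matches the roles of the three topology parameters named in the text.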

Illustrating Influence of α and β
With MATLAB R2010b, we explore how the optimal infection probability and the MTTF of a sensor node depend on the true positive rate (i.e., the detection rate) and the false positive rate (i.e., the false alarm rate). The parameters of the strategic game are $\omega = 50$, $c_I = 20$, $c_D = 10$, and $\lambda = 0.5$. Note that we can attain similar trends if the parameters are changed, although the specific values will change correspondingly. Figures 3 and 4 demonstrate how the optimal infection probability that player malware adopts changes with $\alpha$ and $\beta$, which reveals the malware's intention. A higher detection rate and a lower false-alarm rate help a WSN IDS detect malware. Therefore, player malware will choose its optimal strategy to lower the infection probability in order to minimize the loss arising from IDS detection. As shown in Figure 3, the optimal infection probability decreases gradually when the true positive rate increases from 0.7 to 0.98. Moreover, a lower false positive rate results in a lower infection probability. For example, when $\alpha = 0.88$ in Figure 3, the optimal infection probabilities are ~0.1667, ~0.1984, and ~0.2381 for $\beta = 0.01$, $\beta = 0.05$, and $\beta = 0.1$, respectively. We observe in Figure 4 that the optimal infection probability increases gradually when the false positive rate increases from 0.01 to 0.15. Furthermore, a higher detection rate leads to a lower infection probability. For example, when $\beta = 0.1$ in Figure 4, the optimal infection probabilities are ~0.2727, ~0.2308, and ~0.2055 for $\alpha = 0.8$, $\alpha = 0.9$, and $\alpha = 0.98$, respectively. These experimental results indicate that the true positive rate should be increased and the false positive rate should be decreased in order to decrease the infection probability adopted by player malware and to enhance a sensor node's reliability.

Illustrating Influence of  and 
Figure 5 shows the MTTF of a sensor node under the epidemic-malware propagation scenario in terms of α and β. As the true positive rate increases and the false positive rate decreases, it becomes easier for player system to detect infected sensor nodes, which increases their MTTFs. As expected, Figure 5 shows that the MTTF increases slowly as the true positive rate increases from 70% to 98%. A similar tendency appears as the false positive rate decreases from 15% to 1%. A decrease in the false positive rate raises the MTTF of a sensor node more than an increase of the same size in the true positive rate. For example, when α = 88%, the MTTF of a sensor node increases from ~4.6366 to ~5.7219 (an increase of ~23.41%) as β drops from 15% to 1%. In contrast, when β = 10%, the MTTF of a sensor node increases from ~4.2487 to ~5.0350 (an increase of only ~18.51%) as α increases from 70% to 90%. These results indicate that, when improving IDSes for WSNs, reducing the false positive rate deserves particular attention in order to increase a sensor node's MTTF.
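The percentage gains quoted for Figure 5 follow from the relative increase (new − old)/old. The short Python sketch below recomputes them and also normalizes each gain by the size of the rate change, in percentage points, to make the comparison between the two rates explicit. The helper name and the per-point normalization are illustrative assumptions; only the four MTTF values come from the text.

```python
def relative_increase(old, new):
    """Relative increase from old to new, in percent."""
    return (new - old) / old * 100.0


# MTTF values quoted in the text
gain_beta = relative_increase(4.6366, 5.7219)   # beta: 15% -> 1%, alpha = 88%
gain_alpha = relative_increase(4.2487, 5.0350)  # alpha: 70% -> 90%, beta = 10%

# Gain per percentage point of rate change (illustrative normalization)
gain_beta_per_point = gain_beta / (15 - 1)
gain_alpha_per_point = gain_alpha / (90 - 70)

if __name__ == "__main__":
    print(f"MTTF gain from lowering beta:  {gain_beta:.2f}%")
    print(f"MTTF gain from raising alpha:  {gain_alpha:.2f}%")
    print(f"per point (beta):  {gain_beta_per_point:.3f}%")
    print(f"per point (alpha): {gain_alpha_per_point:.3f}%")
```

A 14-point drop in β yields a larger MTTF gain than a 20-point rise in α, so the per-point gain for β is clearly the higher of the two.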

Validating the Method of Reliability Evaluation for a Clustered WSN
Next, from the perspective of a clustered WSN, we evaluate its reliability according to the number of sensor nodes in a cluster, the number of clusters in a route, and the number of routes in a clustered WSN, as illustrated in Figures 6-8, respectively.
Figure 6 illustrates the reliability of a clustered WSN when there are two, four, and six sensor nodes in a cluster. When both the number of clusters in a route and the number of routes in the clustered WSN are fixed, the reliability of the WSN increases with the number of sensor nodes in a cluster. With two, four, and six nodes in a cluster, it takes about six, nine, and eleven days, respectively, for the reliability of the clustered WSN to fall to 0.5 under the epidemic-malware propagation scenario.
Figure 7 illustrates the reliability of a clustered WSN when there are two, four, and six clusters in a route. When both the number of sensor nodes in a cluster and the number of routes in the clustered WSN are fixed, the reliability of the WSN decreases with the number of clusters in a route. With two, four, and six clusters in a route, it takes about eleven, eight, and seven days, respectively, for the reliability of the clustered WSN to fall to 0.5.
Figure 8 illustrates the reliability of a clustered WSN when there are two, four, and six routes in the clustered WSN. When both the number of sensor nodes in a cluster and the number of clusters in a route are fixed, the reliability of the WSN increases with the number of routes. With two, four, and six routes in the clustered WSN, it takes about seven, eight, and nine days, respectively, for the reliability of the clustered WSN to fall to 0.5.
In summary, the experimental results shown in Figures 6-8 indicate that deploying more redundant sensor nodes in a cluster, reducing the number of clusters along constructed routes, and providing more available routes all help improve the reliability of a clustered WSN, which accords with our expectations.
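The three trends above are what classical reliability theory predicts for a parallel-serial-parallel system. The following Python sketch illustrates this under simplifying assumptions that are not taken from our derivation: all sensor nodes are independent and identical, node reliability decays exponentially as R(t) = exp(−t/MTTF), every cluster has the same number of nodes, and every route has the same number of clusters. The function names and the sample MTTF value are hypothetical.

```python
import math


def node_reliability(t, mttf):
    """Assumed exponential node reliability R(t) = exp(-t / MTTF)."""
    return math.exp(-t / mttf)


def cluster_reliability(r_node, n_nodes):
    """Parallel: a cluster works if at least one of its nodes works."""
    return 1.0 - (1.0 - r_node) ** n_nodes


def route_reliability(r_cluster, n_clusters):
    """Series: a route works only if every cluster on it works."""
    return r_cluster ** n_clusters


def wsn_reliability(t, mttf, n_nodes, n_clusters, n_routes):
    """Parallel: the WSN works if at least one route works."""
    r_node = node_reliability(t, mttf)
    r_cluster = cluster_reliability(r_node, n_nodes)
    r_route = route_reliability(r_cluster, n_clusters)
    return 1.0 - (1.0 - r_route) ** n_routes


if __name__ == "__main__":
    t, mttf = 5.0, 5.0  # sample values chosen for illustration only
    for n, m, k in [(2, 2, 2), (4, 2, 2), (2, 4, 2), (2, 2, 4)]:
        r = wsn_reliability(t, mttf, n, m, k)
        print(f"nodes={n} clusters={m} routes={k} -> R={r:.4f}")
```

In this sketch, adding nodes to a cluster or routes to the WSN raises the reliability, whereas adding clusters to a route lowers it, in line with Figures 6-8. The absolute values depend on the assumed node model and are not meant to reproduce the curves in those figures.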

Conclusions
We have performed a reliability analysis of clustered WSNs under the epidemic-malware propagation scenario and developed a corresponding measure of reliability evaluation to support the design of highly reliable WSNs. We have related the intent of malware infection to the randomness of CTMCs using a strategic game that can predict malware's infection behavior. We have proposed the MTTF to reflect the reliability of a sensor node, and we have regarded a clustered WSN as a parallel-serial-parallel system. Using this approach, we have obtained equations to compute the reliability of a cluster, a route, and a clustered WSN, respectively. As a result, we have provided a foundation for the mechanism of reliability evaluation for susceptible WSNs. Our experiments have shown that reducing the false positive rate matters more than raising the true positive rate for increasing the MTTFs of susceptible sensor nodes. We have also validated the efficacy of our proposed measure of reliability evaluation for susceptible WSNs.
We have assumed that the topology of a clustered WSN contains only a single sink node; however, actual clustered WSNs may have several sink nodes. Furthermore, the topology of a clustered WSN may change once mobile sensor nodes are introduced. Under these circumstances, the equation to compute the reliability of the clustered WSN will be more complicated. Developing a reliability evaluation method that relaxes these assumptions is an interesting research direction. Providing measures of availability, dependability, and survivability for WSNs under malware propagation is another.