# Application of the D3H2 Methodology for the Cost-Effective Design of Dependable Systems ^{†}

^{†} *International Conference on Probabilistic Safety Assessment and Management (PSAM) 12*, 2014.

## Abstract


## 1. Introduction

## 2. Related Work

## 3. D3H2 Methodology

- The Functional Modelling Approach specifies the functional model, including the system functions, the physical location in which these functions are performed, and the list of resources needed to perform these functions (see Subsection 3.1).
- The Compatibility Analysis identifies compatible implementations (i.e., redundancies) in the functional model. To use these compatibilities, it may be necessary to aggregate additional resources and perform the allocation activity for the new elements. Subsequently, reconfiguration strategies are defined, including all implementations and their priorities (see Subsection 3.2).
- To use homogeneous and heterogeneous redundancies in highly networked scenarios, it is necessary to extend the functional model with fault detection and reconfiguration functions. To perform these functions, it is also necessary to allocate hardware/software (HW/SW) resources to the system functions. Accordingly, the extended HW/SW architecture is designed via an Extended Functional Modelling Approach (see Subsection 3.3).
- Finally, the Dependability Evaluation Modelling Approach predicts the dependability of the extended HW/SW architecture. The dependability and cost analyses allow designers to decide on design variants that achieve the best trade-offs between dependability and cost (see Subsection 3.4).

#### 3.1. Functional Modelling Approach

… Zone_{A}, Zone_{B}).

… Zone_{A} of the Car_{1} in the train. It is comprised of fire detection, fire protection control and alarm subfunctions, and the implementation of the fire detection subfunction is comprised of a fire detection sensor and a PU_{FP} processing unit (#1).

#### 3.2. Compatibility Analysis

… ([Train].[Car_{1}].Zone_{A} ↔ [Train].[Car_{1}].Zone_{B}); or (3) physical locations that span other PLs ([Train].[Car_{1}].[Zone_{A}] → [Train].[Car_{1}].[Zone_{A}].Door).

#### 3.3. Extended Functional Modelling Approach

- Fault detection (FD): each subfunction has an associated fault detection subfunction (FD_SF). The FD_SF is located at the destination processing unit where the information of the source processing unit is used to detect communication omission failures directly.
- Reconfiguration (R): each subfunction has its own reconfiguration subfunction (R_SF), which receives fault detection (FD_SF) signals and sends reconfiguration signals to subfunction implementations.
- Fault detection of the reconfiguration (FD_R): each reconfiguration implementation (R_SF) has its own fault detection mechanism (FD_R_SF) implemented in a keepalive configuration. Each R_SF implementation sends keepalive signals to all its FD_R_SF implementations to indicate that it is operating. In the absence of a keepalive signal during a time slot, an R_SF implementation is assumed to have failed. When this happens, the FD_R_SF implementation sends an activation signal to the available R_SF implementation with the highest priority.
- Communication is considered at resource level.
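The keepalive mechanism described for FD_R can be illustrated with a minimal Python sketch. The class `ReconfigImpl`, the function `check_keepalives`, and the convention that a lower number means a higher priority are hypothetical names and assumptions introduced here for illustration; they are not part of the D3H2 methodology itself.

```python
from dataclasses import dataclass


@dataclass
class ReconfigImpl:
    """One R_SF implementation as seen by an FD_R_SF monitor (hypothetical model)."""
    name: str
    priority: int            # assumed convention: lower number = higher priority
    last_keepalive: float    # time of the most recent keepalive signal
    available: bool = True   # set to False once the implementation is assumed failed


def check_keepalives(impls, active, now, time_slot):
    """Return the R_SF implementation that should be active at time `now`.

    If the active implementation has been silent for longer than `time_slot`,
    it is assumed failed and the available implementation with the highest
    priority is activated. Returns None when no implementation is left.
    """
    if now - active.last_keepalive <= time_slot:
        return active                    # keepalive received in time: no change
    active.available = False             # silence during the time slot: assumed failed
    candidates = [r for r in impls if r.available]
    if not candidates:
        return None
    return min(candidates, key=lambda r: r.priority)


r1 = ReconfigImpl("R_SF.Impl1", priority=1, last_keepalive=0.0)
r2 = ReconfigImpl("R_SF.Impl2", priority=2, last_keepalive=0.9)
active = check_keepalives([r1, r2], active=r1, now=1.5, time_slot=1.0)
print(active.name)
```

In this example the first implementation has been silent for 1.5 time units, which exceeds the time slot of 1.0, so the monitor activates the second implementation.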

#### 3.4. Dependability Evaluation Modelling Approach

#### 3.4.1. Preliminaries on Component Dynamic Fault Trees

- Modular specification to improve clarity, maintainability, and traceability to the design architecture. Embed the failure logic of a set of events or components and (re)use it where needed.
- Temporal logic to capture the system failure logic accounting for time-ordered events.
- Specification of any cumulative distribution function of failure events.
- Specification of repeated basic events, subsystems, or components.
- Specification of NOT gates to address the influence of functional events.

… C_{2}) and CDFT gates. Each component (C_{1}, C_{2}) may have gates, basic events and/or other components as inputs. Each basic event (BE_{1}, …, BE_{6}) is specified according to its cumulative distribution function and its failure rates.

**Definition 1.** **Component Dynamic Fault Tree**: the Component Dynamic Fault Tree model, CDFT, is a 4-tuple $\langle N,\, G,\, SC,\, E\rangle$ where:

- $N$ is the set of Nodes, which are partitioned into a set of internal events $N_{intern}$, input ports $N_{in}$ and output ports $N_{out}$; $N=\{N_{intern},\, N_{in},\, N_{out}\}$. For instance, for the CDFT model depicted in Figure 6, considering $C_{1}$: $N_{intern}=\{C_{1}.BE_{1},\, C_{1}.BE_{2}\}$, $N_{in}=\{C_{1}.in_{1},\, C_{1}.in_{2},\, C_{1}.in_{3},\, C_{1}.in_{4}\}$, $N_{out}=\{C_{1}.Out_{1},\, C_{1}.Out_{2}\}$.
- $G$ is the set of Gates, where each gate $g\in G$ is described by: one output port $g.out$; one or more input ports $g.in_{i}$, $i\in \mathbb{N}$; and a dynamic function which links inputs with outputs according to static (AND, OR, KooN) and/or dynamic (PAND) Fault Tree gates. As displayed in Table 2, the behaviour of each CDFT gate is defined according to its input events (A, B), which can be extended to an arbitrary number of input events.
- $SC$ is a set of Sub-Components, where each subcomponent $sc\in SC$ is described by: one or more output ports $sc.out_{i}$; one or more input ports $sc.in_{i}$; and a mapping to another CDFT component’s failure logic. For instance, for the CDFT model depicted in Figure 6, $SC=C_{2}$: $N_{in}=\{C_{2}.in_{1},\, C_{2}.in_{2},\, C_{2}.in_{3}\}$, $N_{out}=\{C_{2}.Out_{1}\}$; mapping: $C_{1}.in_{1}\to C_{2}.in_{1}$; $C_{1}.in_{2}\to C_{2}.in_{2}$; $C_{1}.in_{3}\to C_{2}.in_{3}$; $C_{2}.out_{1}\to OR.in_{2}$; $C_{2}.out_{1}\to AND.in_{1}$.
- $E$ is a set of directed Edges $E \subseteq (N_{intern}\cup N_{in}\cup G.OUT\cup SC.OUT) \times (N_{out}\cup G.IN\cup SC.IN)$, where $G.OUT$ is the set of all outputs of all gates; $G.IN$ is the set of all inputs of all gates; $SC.OUT$ is the set of all outputs of all sub-components; and $SC.IN$ is the set of all inputs of all sub-components.
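As an illustration of Definition 1, the 4-tuple can be held in a small data structure that rejects edges violating the constraint on $E$. The following Python sketch is only one possible encoding: the class name `CDFT`, its field names, and the string port labels (including `OR.in1` and `OR.out`) are assumptions made here, and only the two edges taken from the mapping in the definition are added.

```python
from dataclasses import dataclass, field


@dataclass
class CDFT:
    """Container mirroring the 4-tuple <N, G, SC, E> of Definition 1 (illustrative)."""
    n_intern: set = field(default_factory=set)  # internal events
    n_in: set = field(default_factory=set)      # input ports
    n_out: set = field(default_factory=set)     # output ports
    g_in: set = field(default_factory=set)      # G.IN: inputs of all gates
    g_out: set = field(default_factory=set)     # G.OUT: outputs of all gates
    sc_in: set = field(default_factory=set)     # SC.IN: inputs of all sub-components
    sc_out: set = field(default_factory=set)    # SC.OUT: outputs of all sub-components
    edges: set = field(default_factory=set)     # E: directed (source, target) pairs

    def add_edge(self, src, dst):
        """Add a directed edge only if it respects the constraint on E."""
        sources = self.n_intern | self.n_in | self.g_out | self.sc_out
        targets = self.n_out | self.g_in | self.sc_in
        if src not in sources or dst not in targets:
            raise ValueError(f"edge {src} -> {dst} violates the definition of E")
        self.edges.add((src, dst))


c1 = CDFT(
    n_intern={"C1.BE1", "C1.BE2"},
    n_in={"C1.in1", "C1.in2", "C1.in3", "C1.in4"},
    n_out={"C1.Out1", "C1.Out2"},
    g_in={"OR.in1", "OR.in2"},   # hypothetical port labels
    g_out={"OR.out"},            # hypothetical port label
    sc_in={"C2.in1", "C2.in2", "C2.in3"},
    sc_out={"C2.out1"},
)
c1.add_edge("C1.in1", "C2.in1")   # component input forwarded to the sub-component
c1.add_edge("C2.out1", "OR.in2")  # sub-component output feeding a gate
```

An edge such as `c1.add_edge("C1.Out1", "C1.in1")` raises `ValueError`, because an output port cannot be a source and an input port cannot be a target under the definition of $E$.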

**BE**(parameters, distribution) generates the corresponding failure data of basic events. Note: ${C}_{2}.Ou{t}_{1}$ is simplified to ${C}_{2}$ in the previous equations because ${C}_{2}$ has a single output. For clarity and conciseness, in the remainder of the paper we will use the CDFT equations to express the failure logic of systems, instead of the graphical representation of CDFTs.

**pPAND**(d, A, B), where d is the time distance between events A and B. Y is true only if A fails before B and B fails within d time units after A.
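One common way to read these gates in a simulation setting is to represent each event by its failure time and let each gate return the failure time of its output. The Python sketch below follows the gate behaviours stated in Table 2 and the pPAND description above; the time-based representation, the constant `NEVER`, and the function signatures are assumptions made for illustration, and the NOT gate is left out because it needs a mission time to be evaluated.

```python
import math

NEVER = math.inf  # failure time of an event that never fails


def AND(*times):
    """Output fails when the last input fails."""
    return max(times)


def OR(*times):
    """Output fails when the first input fails."""
    return min(times)


def PAND(t_a, t_b):
    """Output fails at t_b if A fails before B or at the same time."""
    return t_b if t_a <= t_b else NEVER


def pPAND(d, t_a, t_b):
    """Output fails at t_b if A fails before B and B fails within d time units after A."""
    return t_b if t_a < t_b and (t_b - t_a) <= d else NEVER


print(AND(2.0, 5.0), OR(2.0, 5.0))          # 5.0 2.0
print(PAND(2.0, 5.0), PAND(5.0, 2.0))       # 5.0 inf
print(pPAND(4.0, 2.0, 5.0), pPAND(1.0, 2.0, 5.0))  # 5.0 inf
```

With A failing at 2.0 and B at 5.0, the pPAND output fails only when the allowed distance d covers the 3.0 time units between the two failures.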

#### 3.4.2. Dependability Evaluation Modelling Approach: Concepts and Notation

… i^{th} implementation ([SF].[Imp_{i}] Failure) comprised of N resources is defined as:

#### 3.4.3. Dependability Evaluation Modelling Approach: Analysis Algorithm

…_{SF} implementations of the subfunction, the $\mathcal{F}_{\text{All Imp1.}}$ event happens when each implementation fails or is detected as failed:

… i^{th} implementation of the subfunction fails and the reconfiguration has failed but after successfully reconfiguring the previous i-1 implementations (reconfiguration sequence failure, $\mathcal{F}_{\text{R Seq.}_{i}}$). Assuming $\mathcal{F}_{\text{SF}_{1..i-1}\text{ FP}}=\mathbf{AND}(\mathcal{F}_{\text{SF}_{1}\text{ FP}},\dots ,\mathcal{F}_{\text{SF}_{i-1}\text{ FP}})$ indicates the failure or false positive from 1 to i-1 implementations:

… i^{th} implementation of the subfunction fails and the fault detection of the subfunction has failed but after detecting correctly previous i-1 implementation failures (fault detection sequence failure, $\mathcal{F}_{\text{FD Seq.}_{i}}$). Note that fault detection’s false positive and omission failures are mutually exclusive:

… i^{th} implementation’s failure unresolved event ($\mathcal{F}_{\text{Unr. Imp}_{i}}$) occurs when either the fault detection sequence ($\mathcal{F}_{\text{FD Seq.}_{i}}$) fails or the reconfiguration sequence ($\mathcal{F}_{\text{R Seq.}_{i}}$) fails:

…_{Dest}’s implementation determines its reconfiguration. We assume that the change of destination subfunction’s implementation activates the corresponding fault detection implementation and the previous one is deactivated. Equation (15) describes the FD_SF failure case when FD_SF has K implementations:

… i^{th} fault detection implementation ($\mathcal{F}_{\text{FD\_Dest}_{i}}$), it expresses the following event: from 1 to i-1 implementations of the destination SF fail and reconfigure correctly ($\mathcal{F}_{\text{SF\_Dest}_{1..i-1}}$), and then either the i^{th} fault detection occurs or the implementation of the destination subfunction fails:

#### 3.4.4. Dependability Evaluation Modelling Approach: Uncertainty Analysis

- Monte Carlo sampling of the uncertain variables: from the failure rates of the uncertain variables, a single failure rate value is chosen randomly within the specified failure rate interval according to the uniform distribution. A randomly sampled failure rate is the outcome of this activity.
- Monte Carlo sampling of the time to failure of uncertain variables and known variables from their cumulative distribution function. A set of randomly sampled time-to-failure instants is the outcome of this process.
- With the updated values, the CDFT model is solved extracting counters of top-event failure occurrences and critical event failure occurrences.
- After N Monte Carlo trials, the CDFT model’s statistical results are gathered in a histogram which illustrates and classifies the frequency of occurrence of the top event.
- After M Monte Carlo trials, the process ends and the histogram is normalized.
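The steps above can be sketched as a two-level Monte Carlo loop. The Python code below is a minimal interpretation under stated assumptions, not the authors' implementation: times to failure are assumed to be exponentially distributed, the inner loop of N trials is assumed to run for each of the M sampled failure rates, the failure logic in `top_event_time` is a hypothetical stand-in for solving a CDFT model, and the communication failure rate interval is invented for the example. The sensor rate and the 12-year horizon are taken from the tables of this paper.

```python
import random
from collections import Counter


def sample_ttf(rate, rng):
    """Sample a time to failure; an exponential distribution is assumed here."""
    return rng.expovariate(rate)


def top_event_time(ttf):
    """Hypothetical failure logic: OR(comm, AND(sensor1, sensor2))."""
    return min(ttf["comm"], max(ttf["sensor1"], ttf["sensor2"]))


def uncertainty_analysis(known, uncertain, mission_time, n_trials, m_trials, seed=0):
    rng = random.Random(seed)
    histogram = Counter()
    for _ in range(m_trials):
        # Draw one failure rate per uncertain variable, uniformly within its interval.
        rates = dict(known)
        for name, (low, high) in uncertain.items():
            rates[name] = rng.uniform(low, high)
        # Sample failure times, evaluate the failure logic, count top-event occurrences.
        failures = 0
        for _ in range(n_trials):
            ttf = {name: sample_ttf(rate, rng) for name, rate in rates.items()}
            if top_event_time(ttf) <= mission_time:
                failures += 1
        probability = failures / n_trials
        histogram[round(probability, 2)] += 1  # histogram bins of width 0.01
    # Normalise the histogram so that the bin frequencies sum to 1.
    return {b: count / m_trials for b, count in sorted(histogram.items())}


result = uncertainty_analysis(
    known={"sensor1": 1.6e-2, "sensor2": 1.6e-2},  # failure rates per year
    uncertain={"comm": (1e-3, 1e-2)},               # hypothetical interval
    mission_time=12.0, n_trials=2000, m_trials=50,
)
for probability_bin, frequency in result.items():
    print(probability_bin, frequency)
```

The returned dictionary maps each top-event failure probability bin to its relative frequency, which is the normalised histogram referred to in the last step.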

## 4. D3H2 Application: Train Car Door Status Control

… PU_{Driver}) and each door throughout the train has: one opening button for passengers, one door speed sensor, one door open detection sensor, one door closed detection sensor and one obstacle detection sensor. All these sensors, their controllers, and the door control algorithm are located in the PU_{1}.

…_{TCMS}) for safety purposes. The TCMS receives information about the speed of the train and it will not allow the driver to open the doors while the train is running. To this end, the TCMS sends an enable signal to the driver to inform the driver about the safe operation of door opening or closing (Enable Door Driver, EDD). Using the information of the Enable Door Driver signal, the driver sends an enable signal to the controller of each door (Enable Door Passenger, EDP) to act safely on opening/closing the doors, while taking into account if the train is moving and whether there is an obstacle in the door.

… Car_{1}.Zone_{A}.Door (Figure 4).

… PU_{Cam}. For clarity, only relevant information of the Video Surveillance main function is shown in Figure 8a.

… PU_{Cam} of the Video Surveillance main function.

#### 4.1. Redundancy Strategies

#### 4.2. Reconfiguration Strategies

…_{i} refers to the i^{th} implementation of the subfunction; 1R, 2R and 3R identify the number of reconfiguration implementations; and the letters C and D designate centralised and distributed configurations, respectively.

… λ_{SW_HM}): SW_FD, SW_R and SW_FD_R. The failure rates of these components have been modified together to highlight the influence of reconfiguration implementations on system failure probability. Table 9 displays failure probability values of the DSC main function for alternative reconfiguration strategies with different failure rate values of health monitoring software components. For instance, the configuration 2R C is the same as the architecture described in Figure 8b (including heterogeneous redundancies for OD and DV in Table 4 and repeating the health monitoring configuration of the DOD subfunction for DCD, OD, and DV according to Table 7) and the configuration 2 of Figure 9 (PU_{Cam} = PU_{2}).

… λ_{SW_HM} and number of reconfiguration redundancies, the better the failure probability of distributed reconfigurations. The failure probability of centralised reconfigurations confirms that the introduction of additional components increases system failure sources. However, with the increase of the failure rate values and the reconfiguration’s redundancies, the system’s common cause failures gain importance, and distributed implementations perform better than configurations with system bottlenecks. Interestingly, there is a “threshold” failure rate, beyond which the distribution of reconfiguration strategies has no impact on the failure probability of the system. The “threshold” failure rate decreases as the number of the reconfiguration’s redundancy implementations increases (see grey cells in Table 9). This should be studied further, but it seems reasonable that the higher the failure probability of the reconfiguration implementations, the less important the impact of the reconfiguration strategies becomes.

#### 4.3. Health Management Mechanisms and Communication Influences

… PU_{s}. Table 10 displays FCI values of fault detection ($\mathcal{FCI}_{\mathcal{F}_{\mathit{FD\_SF}}}$) and reconfiguration subfunctions ($\mathcal{FCI}_{\mathcal{F}_{\mathit{R\_SF}}}$) for different Redundancy Strategies (RS).

## 5. Conclusions

- (a) The formal identification and categorisation of heterogeneous redundancies for complex systems is a challenging task. The lack of deterministic relations between some of the variables hampers the formalisation process. Possible solutions to address these issues can include formalisation of engineering design knowledge through meta-modelling techniques (e.g., [42]) or formal analysis of highly networked scenarios through equation-based modelling formalisms (e.g., [43]).
- (b) We have not included downtime costs arising from repair activities, which lead to high financial penalties due to the immobilization of trains on stations or tracks. As future work, we plan to perform the following activities: (1) introduce repair concepts to evaluate availability and downtime costs; (2) automate the architecture optimisation by extracting the combination of homogeneous and heterogeneous redundancies that maximizes dependability and minimizes cost, e.g., by extending work on metaheuristics in [14]; and (3) weigh the degradation of the functionality considering factors other than component failure rates.
- (c) While for non-repairable systems only the order of failure is important, for repairable systems, both the order of failure and repair must be respected. In D3H2, the reconfiguration process is governed by the reconfiguration priority of implementations. This means that the reconfiguration process is not necessarily a sequential process, but it can follow a random process. Therefore, the predefined sequential logic of repairable Dynamic Fault Tree gates [24] is invalid for repairable systems. More powerful and flexible stochastic formalisms are needed to address these properties (e.g., [44]).
- (d) Finally, with the dependability evaluation model presented in this paper, there is potential for automation and optimization via the use of metaheuristics. Fitness functions can include dependability and cost, while the parameters to be altered may include the location, type, and level of redundancies and health monitoring mechanisms. Metaheuristics can focus on choosing values for these variables that optimize the trade-off between dependability and cost.

## Acknowledgments

## Author Contributions

## Conflicts of Interest

## Abbreviations/Nomenclature

| Abbreviation | Definition |
|---|---|
| CDFT | Component Dynamic Fault Tree |
| DCA | Door Control Algorithm |
| DCD | Door Closed Detection |
| DM | Door Manipulation |
| DOC | Door Open Command |
| DOD | Door Open Detection |
| DSC | Door Status Control |
| DV | Door Velocity |
| EDD | Enable Door Driver |
| EDP | Enable Door Passenger |
| FCI | Failure Criticality Index |
| FD | Fault Detection |
| FD_R_SF | Fault Detection of the R_SF |
| FD_SF | Fault Detection of the SF |
| FP | False Positive |
| MF | Main Function |
| O | Omission |
| OD | Obstacle Detection |
| PL | Physical Location |
| PU | Processing Unit |
| R | Reconfiguration |
| R_SF | Reconfiguration of the SF |
| SF | Subfunction |
| TCMS | Train Control and Monitoring System |

## References

1. Elegbede, A.; Chu, C.; Adjallah, K.; Yalaoui, F. Reliability allocation through cost minimization. IEEE Trans. Reliab. **2003**, 52, 106–111.
2. Avizienis, A. The N-Version Approach to Fault-Tolerant Software. IEEE Trans. Softw. Eng. **1985**, SE-11, 1491–1501.
3. Aizpurua, J.I.; Muxika, E. Functionality and Dependability Assurance in Massively Networked Scenarios. In Safety, Reliability and Risk Analysis: Beyond the Horizon; CRC Press: Boca Raton, FL, USA, 2013; pp. 1763–1771.
4. Aizpurua, J.I.; Muxika, E.; Manno, G.; Chiacchio, F. Heterogeneous Redundancy Analysis based on Component Dynamic Fault Trees. In Proceedings of PSAM 12, Honolulu, HI, USA, 22–27 June 2014.
5. Avizienis, A.; Laprie, J.C.; Randell, B.; Landwehr, C. Basic Concepts and Taxonomy of Dependable and Secure Computing. IEEE Trans. Dependable Secur. Comput. **2004**, 1, 11–33.
6. Shelton, C.P.; Koopman, P. Improving System Dependability with Functional Alternatives. In Proceedings of the Int. Conf. on Dependable Systems and Networks (DSN), Florence, Italy, 28 June–1 July 2004; pp. 295–304.
7. Wysocki, J.; Debouk, R. Methodology for Assessing Safety-critical Systems. Int. J. Model. Simul. **2007**, 27, 99–106.
8. Adler, R.; Schneider, D.; Trapp, M. Engineering dynamic adaptation for achieving cost-efficient resilience in software-intensive embedded systems. In Proceedings of the Engineering of Complex Computer Systems, Oxford, UK, 22–26 March 2010; pp. 21–30.
9. Aizpurua, J.I.; Muxika, E. Model Based Design of Dependable Systems: Limitations and Evolution of Analysis and Verification Approaches. Int. J. Adv. Secur. **2013**, 6, 12–31.
10. Strigini, L. Fault Tolerance Against Design Faults. In Dependable Computing Systems: Paradigms, Performance Issues, and Applications; Diab, H., Zomaya, A., Eds.; John Wiley & Sons: New York, NY, USA, 2005; pp. 213–241.
11. Blanke, M.; Hansen, S.; Blas, M.R. Diagnosis for Control and Decision Support in Complex Systems. In Proceedings of Synergy of Control, Communications and Computing–COSY, Ohrid, Macedonia, 16–20 September 2011; pp. 89–101.
12. Cauffriez, L.; Renaux, D.; Bonte, T.; Cocquebert, E. Systemic Modeling of Integrated Systems for Decision Making Early on in the Design Process. Cybern. Syst. **2013**, 44, 1–22.
13. Perez, D.; Mirandola, R.; Merseguer, J. On the Relationships between QoS and Software Adaptability at the Architectural Level. J. Syst. Softw. **2014**, 87, 1–17.
14. Adachi, M.; Papadopoulos, Y.; Sharvia, S.; Parker, D.; Tohdo, T. An approach to optimization of fault tolerant architectures using HiP-HOPS. Softw. Pract. Exp. **2011**, 41, 1303–1327.
15. Katsaros, P.; Angelis, L.; Lazos, C. Performance and effectiveness trade-off for checkpointing in fault-tolerant distributed systems. Concurr. Comput. Pract. Exp. **2007**, 19, 37–63.
16. Chen, D.; Lönn, H.; Mraidha, C.; Papadopoulos, Y.; Reiser, M.; Servat, D.; Azevedo, L.S.; Piergiovanni, S.T.; Walker, M. Automatic Optimisation of System Architectures using EAST-ADL. In Proceedings of the SAFECOMP 2013 – Workshop ASCoMS (Architecting Safety in Collaborative Mobile Systems), Toulouse, France, 24–27 September 2013.
17. Marca, D.A.; McGowan, C.L. SADT: Structured Analysis and Design Technique; McGraw-Hill, Inc.: New York, NY, USA, 1987.
18. Aizpurua, J.I. Functionality and Dependability Assurance in Massively Networked Scenarios. Ph.D. Thesis, Electronics and Computing Department, Mondragon University, Basque Country, Spain, January 2015.
19. Asim, M.; Zhou, B.; Llewellyn-Jones, D.; Shi, Q.; Merabti, M. Dynamic Monitoring of Composed Services. In Cyberpatterns; Blackwell, C., Zhu, H., Eds.; Springer: Berlin, Germany, 2014; pp. 235–245.
20. Kaiser, B.; Liggesmeyer, P.; Mäckel, O. A New Component Concept for Fault Trees. In Proceedings of the Safety Critical Systems & Software (SCS), Canberra, Australia, 9–10 October 2003; pp. 37–46.
21. Papadopoulos, Y.; Walker, M.; Parker, D.; Rüde, E.; Hamann, R.; Uhlig, A.; Grätz, U.; Lien, R. Engineering failure analysis and design optimisation with HiP-HOPS. Eng. Failure Anal. **2011**, 18, 590–608.
22. Dugan, J.; Bavuso, S.; Boyd, M. Dynamic fault-tree models for fault-tolerant computer systems. IEEE Trans. Reliab. **1992**, 41, 363–377.
23. Montani, S.; Portinale, L.; Bobbio, A.; Codetta-Raiteri, D. Radyban: A tool for reliability analysis of dynamic fault trees through conversion into dynamic Bayesian networks. Reliab. Eng. Syst. Saf. **2008**, 93, 922–932.
24. Manno, G.; Chiacchio, F.; Compagno, L.; D’Urso, D.; Trapani, N. Conception of Repairable Dynamic Fault Trees and resolution by the use of RAATSS, a Matlab toolbox based on the ATS formalism. Reliab. Eng. Syst. Saf. **2014**, 121, 250–262.
25. Codetta-Raiteri, D. The Conversion of Dynamic Fault Trees to Stochastic Petri Nets, as a case of Graph Transformation. Electron. Notes Theor. Comput. Sci. **2005**, 127, 45–60.
26. Bouissou, M.; Bon, J.L. A new formalism that combines advantages of fault-trees and Markov models: Boolean logic driven Markov processes. Reliab. Eng. Syst. Saf. **2003**, 82, 149–163.
27. Kaiser, B.; Gramlich, C.; Forster, M. State-Event Fault Trees - A Safety Analysis Model for Software-Controlled Systems. Reliab. Eng. Syst. Saf. **2007**, 92, 1521–1537.
28. Raiteri, D.C. Integrating several formalisms in order to increase Fault Trees’ modeling power. Reliab. Eng. Syst. Saf. **2011**, 96, 534–544.
29. Zio, E. The Monte Carlo Simulation Method for System Reliability and Risk Analysis; Springer: Berlin, Germany, 2013.
30. Edifor, E.; Walker, M.; Gordon, N. Quantification of Simultaneous-AND Gates in Temporal Fault Trees. In New Results in Dependability and Computer Systems; Springer: Berlin, Germany, 2013; Volume 224, pp. 141–151.
31. Littlewood, B.; Strigini, L. Software Reliability and Dependability: A Roadmap. In Proceedings of the Conference on The Future of Software Engineering, Limerick, Ireland, 4–11 June 2000; pp. 175–188.
32. Goseva-Popstojanova, K.; Trivedi, K.S. Architecture-based approach to reliability assessment of software systems. Perform. Eval. **2001**, 45, 179–204.
33. Lyu, M.R. Software Reliability Engineering: A Roadmap. In Proceedings of the Future of Software Engineering, 2007 (FOSE ’07), Minneapolis, MN, USA, 23–25 May 2007; pp. 153–170.
34. Forster, M.; Trapp, M. Fault Tree Analysis of Software-Controlled Component Systems Based on Second-Order Probabilities. In Proceedings of the ISSRE’09, Mysuru, Karnataka, 16–19 November 2009; pp. 146–154.
35. Manno, G.; Chiacchio, F.; Compagno, L.; D’Urso, D.; Trapani, N. MatCarloRe: An integrated FT and Monte Carlo Simulink tool for the reliability assessment of dynamic fault tree. Expert Syst. Appl. **2012**, 39, 10334–10342.
36. Meedeniya, I.; Moser, I.; Aleti, A.; Grunske, L. Architecture-based Reliability Evaluation Under Uncertainty. In Proceedings of QoSA-ISARCS ’11, Boulder, CO, USA, 20–24 June 2011; pp. 85–94.
37. Kanoun, K. Real-world design diversity: A case study on cost. IEEE Softw. **2001**, 18, 29–33.
38. IAEA. Component Reliability Data for Use In Probabilistic Safety Assessment; IAEA-TECDOC-478; Technical Report for IAEA: Vienna, Austria, 1988.
39. JVC Professional. Available online: http://pro.jvc.com/ (accessed on 5 August 2015).
40. Vinod, G.; Santosh, T.; Saraf, R.; Ghosh, A. Integrating Safety Critical Software System in Probabilistic Safety Assessment. Nuclear Eng. Des. **2008**, 238, 2392–2399.
41. Wang, W.; Loman, J.; Vassiliou, P. Reliability importance of components in a complex system. In Proceedings of the 2004 Annual Symposium Reliability and Maintainability, Los Angeles, CA, USA, 26–29 January 2004; pp. 6–11.
42. Henderson-Sellers, B. Bridging metamodels and ontologies in software engineering. J. Syst. Softw. **2011**, 84, 301–313.
43. Fritzson, P. Introduction to Modeling and Simulation of Technical and Physical Systems with Modelica; Wiley-IEEE Press: Hoboken/Piscataway, NJ, USA, 2011.
44. Sanders, W.H.; Meyer, J.F. Stochastic Activity Networks: Formal Definitions and Concepts. In Lectures on Formal Methods and Performance Analysis; Springer: Berlin, Germany, 2001; Volume 2090, pp. 315–343.

**Figure 2.** D3H2 methodology [3]: **(a)** Functional Modelling Approach; **(b)** Compatibility Analysis; **(c)** Extended Functional Modelling Approach; **(d)** Dependability Evaluation Modelling Approach.

**Figure 3.** Functional Modelling Approach [18].

**Figure 6.** Component Dynamic Fault Tree example [4].

**Figure 12.** DSC failure probability distributions for different communication failure rate intervals.

| Main Function | Physical Location | Subfunction | Resources | # |
|---|---|---|---|---|
| Fire Protection | Train.Car_{1}.Zone_{A} | Fire Detection | Fire Detector, PU_{FP} | 1 |
| | | Fire Protection Control | Fire Detection, PU_{FP}, SW_{FP} | 2 |
| | | Alarm | Fire Protection Control, PU_{FP}, Sprinkler | 3 |
| Passenger Alarm System | Train.Car_{1}.Zone_{A} | Passenger Alarm | Emergency Button, PU_{PAS} | 4 |
| | | Process Alarm | Passenger Alarm, PU_{PAS}, SW_{PAS} | 5 |
| | | Alarm | Process Alarm, PU_{PAS}, Siren | 6 |
| Passenger Info. System | Train.Car_{1}.Zone_{A} | Current Position | GPS, PU_{Driver} | 7 |
| | | Process Information | Current Position, PU_{Driver}, SW_{PIS} | 8 |
| | | Activate Message | Process Information, PU_{PIS}, Display, Comm | 9 |
| Temperature Control | Train.Car_{1}.Zone_{A} | Temperature Measurement | Temperature Sensor_{A}, PU_{TC_A} | 10 |
| | | ... | ... | 11 |
| | Train.Car_{1}.Zone_{B} | Temperature Measurement | Temperature Sensor_{B}, PU_{TC_B} | 12 |
| | | ... | ... | 13 |

| Gate Notation | Gate Behaviour |
|---|---|
| Y=AND(A,B) | If A fails and B fails, then Y fails |
| Y=OR(A,B) | If A fails or B fails, then Y fails |
| Y=PAND(A,B) | If A fails before the failure of B or at the same time, then Y fails |
| Y=NOT(A) | If A does not fail, then Y fails |

| Notation | Failure Logic | Notation | Failure/Working Logic |
|---|---|---|---|
| $\mathcal{F}_{\text{X}}$ | X failure | $\mathcal{W}_{\text{X}}$ | X working |
| $\mathcal{F}_{\text{SF}}$ | [SF] failure | $\mathcal{W}_{\text{SF}_{i}}$ | [SF].[Impl_{i}] working = $\mathbf{NOT}(\mathcal{F}_{\text{SF}_{i}})$ |
| $\mathcal{F}_{\text{SF}_{i}}$ | [SF].[Impl_{i}] failure | $\mathcal{F}_{\text{R}}$ | [R_SF] failure |
| $\mathcal{F}_{\text{FD}}$ | [FD_SF] failure | $\mathcal{F}_{\text{R}_{i}\text{ O}}$ | [R_SF].[Impl_{i}] omission |
| $\mathcal{F}_{\text{FD FP}}$ | [FD_SF] false positive | $\mathcal{F}_{\text{FD\_R}_{i}\text{ FP}}$ | [FD_{[R_SF].[Impl_{i}]}] false positive |
| $\mathcal{F}_{\text{FD}_{i}}$ | [FD_SF].[Impl_{i}] failure | $\mathcal{F}_{\text{FD\_R}_{i}}$ | [FD_{[R_SF].[Impl_{i}]}] failure |
| $\mathcal{F}_{\text{FD}_{i}\text{ O}}$ | [FD_SF].[Impl_{i}] omission | $\mathcal{F}_{\text{R}_{i}\text{ O/FP}}$ | [R_SF].[Impl_{i}] omission or FP = $\mathbf{OR}(\mathcal{F}_{\text{R}_{i}\text{ O}},\mathcal{F}_{\text{FD\_R}_{i}\text{ FP}})$ |
| $\mathcal{F}_{\text{SF}_{i}\text{ FP}}$ | [SF].[Impl_{i}] failure or FP = $\mathbf{OR}(\mathcal{F}_{\text{SF}_{i}},\mathcal{F}_{\text{FD FP}})$ | | |

| Subfunction | Nominal implementation | Heterogeneous implementation | Homogeneous implementation |
|---|---|---|---|
| Door Open Detection (DOD) | PU_{1}, OpenSensor | Camera, PU_{Cam}, SW_{OpenDet}, Comm | PU_{1}, OpenSensor2 |
| Door Closed Detection (DCD) | PU_{1}, ClosedSensor | Camera, PU_{Cam}, SW_{CloseDet}, Comm | PU_{1}, ClosedSensor2 |
| Obstacle Detection (OD) | PU_{1}, ObstacleSensor | Camera, PU_{Cam}, SW_{ObstDet}, Comm | PU_{1}, ObstacleSensor2 |
| Door Velocity (DV) | PU_{1}, VelocitySensor | Camera, PU_{Cam}, SW_{Speed}, Comm | PU_{1}, VelocitySensor2 |

| Component | λ (year^{−1}) | Cost (€) |
|---|---|---|
| SW_Det, SW_HM | 1E−2 | 80 |
| Pressure Sensor [38] | 1.6E−2 | 20 |
| Speed Sensor [38] | 1.8E−2 | 25 |
| Camera [39] | 9.43E−2 | - |
| PU [40] | 3.87E−2 | 30 |
| Comm. & Gateway | 5E−3 | 200 |
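To give a feel for what these failure rates imply, the short Python sketch below converts each rate into a failure probability over a 12-year horizon (the horizon used in Table 8). It assumes a constant failure rate, i.e., an exponentially distributed time to failure with $F(t) = 1 - e^{-\lambda t}$; this distribution is an assumption made here for illustration, and the function name `failure_probability` is hypothetical.

```python
import math


def failure_probability(rate_per_year, years):
    """CDF of the exponential distribution: F(t) = 1 - exp(-lambda * t)."""
    return 1.0 - math.exp(-rate_per_year * years)


# Failure rates (per year) copied from the table above.
rates = {
    "SW_Det, SW_HM": 1e-2,
    "Pressure Sensor": 1.6e-2,
    "Speed Sensor": 1.8e-2,
    "Camera": 9.43e-2,
    "PU": 3.87e-2,
    "Comm. & Gateway": 5e-3,
}
for name, lam in rates.items():
    print(f"{name}: {failure_probability(lam, 12):.3f}")
```

Under this assumption a single processing unit has a failure probability of roughly 0.37 over 12 years, which helps explain why the placement of functions on processing units matters for the results.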

| ID | Configuration |
|---|---|
| #1 | No redundancies (Figure 8a) |
| #2 | 4 heterogeneous redundancies |
| #3 | 4 homogeneous redundancies |
| #4 | 3 heterogeneous redundancies: DCD, DOD, DV; 1 homogeneous redundancy: OD |
| #5 | 2 heterogeneous redundancies: DCD, DOD; 2 homogeneous redundancies: OD, DV |
| #6 | 1 heterogeneous redundancy: DCD; 3 homogeneous redundancies: OD, DV, DOD |

| Implementation | FD_SF | R_SF | FD_R_SF |
|---|---|---|---|
| Implementation 1 | PU_{1}, SW_{FD_SF}, Comm | PU_{1}, SW_{R_SF} | PU_{Cam}, SW_{FD_R_SF}, Comm |
| Implementation 2 | No redundancy | PU_{Cam}, SW_{R_SF}, Comm | PU_{1}, SW_{FD_R_SF}, Comm |

**Table 8.** Relative failure probability and cost values in Figure 9 (T = 12 years).

| ID | Relative Failure Probability | Relative Cost |
|---|---|---|
| #2 | 0.988 | 1.318 |
| #3 | 0.946 | 1.393 |
| #4 | 0.98 | 1.348 |
| #5 | 0.969 | 1.383 |
| #6 | 0.958 | 1.413 |
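The trade-off in Table 8 can be examined by checking which configurations are dominated, i.e., beaten or matched by another configuration on both failure probability and cost. The Python sketch below applies a simple Pareto filter to the values of Table 8; the function names are hypothetical, and the baseline configuration #1 is left out because its values are not listed in the table.

```python
# (relative failure probability, relative cost) per configuration, from Table 8.
configs = {
    "#2": (0.988, 1.318),
    "#3": (0.946, 1.393),
    "#4": (0.980, 1.348),
    "#5": (0.969, 1.383),
    "#6": (0.958, 1.413),
}


def dominates(a, b):
    """True if a is no worse than b on both criteria and differs from b."""
    return a[0] <= b[0] and a[1] <= b[1] and a != b


def pareto_front(points):
    """Return the names of the configurations that no other configuration dominates."""
    return sorted(
        name for name, p in points.items()
        if not any(dominates(q, p) for q in points.values())
    )


print(pareto_front(configs))  # ['#2', '#3', '#4', '#5']
```

Configuration #6 drops out because #3 offers both a lower relative failure probability and a lower relative cost; the remaining four configurations represent different compromises between the two criteria.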

| Configuration | Reconfiguration Implementation Distributions | DSC Failure Probability (λ_{SW_HM} = 0.05) | DSC Failure Probability (λ_{SW_HM} = 0.15) | DSC Failure Probability (λ_{SW_HM} = 0.25) |
|---|---|---|---|---|
| 1R C | PU_{1}(R_DOD_{1}, R_DCD_{1}, R_OD_{1}, R_DV_{1}) | 0.856 | 0.887 | 0.902 |
| 1R D | PU_{1}(R_DOD_{1}); PU_{2}(R_DCD_{1}); PU_{3}(R_OD_{1}); PU_{4}(R_DV_{1}) | 0.867 | 0.892 | 0.904 |
| 2R C | PU_{1}(R_DOD_{1}, R_DCD_{1}, R_OD_{1}, R_DV_{1}); PU_{2}(R_DOD_{2}, R_DCD_{2}, R_OD_{2}, R_DV_{2}) | 0.850 | 0.888 | 0.905 |
| 2R D | PU_{1}(R_DOD_{1}, R_DCD_{2}); PU_{2}(R_DOD_{2}, R_DCD_{1}); PU_{3}(R_OD_{1}, R_DV_{2}); PU_{4}(R_OD_{2}, R_DV_{1}) | 0.853 | 0.888 | 0.905 |
| 3R C | PU_{1}(R_DOD_{1}, R_DCD_{1}, R_OD_{1}, R_DV_{1}); PU_{2}(R_DOD_{2}, R_DCD_{2}, R_OD_{2}, R_DV_{2}); PU_{3}(R_DOD_{3}, R_DCD_{3}, R_OD_{3}, R_DV_{3}) | 0.838 | 0.874 | 0.897 |
| 3R D | PU_{1}(R_DOD_{1}, R_DCD_{2}, R_OD_{3}); PU_{2}(R_DOD_{2}, R_DCD_{1}, R_DV_{3}); PU_{3}(R_DOD_{3}, R_OD_{1}, R_DV_{2}); PU_{4}(R_DCD_{3}, R_OD_{2}, R_DV_{1}) | 0.839 | 0.875 | 0.897 |

**Table 10.** $\mathcal{FCI}_{\mathcal{F}_{\mathit{FD\_SF}}}$ and $\mathcal{FCI}_{\mathcal{F}_{\mathit{R\_SF}}}$ using different redundancy strategies.

| RS | $\mathcal{FCI}_{\mathcal{F}_{\mathit{FD\_DOD}}}$ | $\mathcal{FCI}_{\mathcal{F}_{\mathit{R\_DOD}}}$ | $\mathcal{FCI}_{\mathcal{F}_{\mathit{FD\_DCD}}}$ | $\mathcal{FCI}_{\mathcal{F}_{\mathit{R\_DCD}}}$ | $\mathcal{FCI}_{\mathcal{F}_{\mathit{FD\_OD}}}$ | $\mathcal{FCI}_{\mathcal{F}_{\mathit{R\_OD}}}$ | $\mathcal{FCI}_{\mathcal{F}_{\mathit{FD\_DV}}}$ | $\mathcal{FCI}_{\mathcal{F}_{\mathit{R\_DV}}}$ |
|---|---|---|---|---|---|---|---|---|
| A | 0.1520 | 0.1367 | 0.1524 | 0.1374 | 0.1520 | 0.1372 | 0.1563 | 0.1416 |
| B | 0.2265 | 0.1949 | 0.2267 | 0.1956 | 0.2265 | 0.1954 | 0.2362 | 0.1999 |
| C | 0.1826 | 0.1623 | 0.1832 | 0.1632 | 0.1825 | 0.1627 | 0.1863 | 0.1674 |

**A**: 4 Homogeneous Redundancies connected to 4 different, explicitly added PU_{s}. **B**: 4 Homogeneous Redundancies connected to the same existing PU_{1}. **C**: 4 Heterogeneous Redundancies connected to PU_{1} and PU_{Cam}.

| Configuration | $\mathcal{FCI}_{\mathcal{F}_{\mathit{DCA}}}$ | $\mathcal{FCI}_{\mathcal{F}_{\mathit{DOD}}}$ | $\mathcal{FCI}_{\mathcal{F}_{\mathit{FD\_DOD\ Seq.}}}$ | $\mathcal{FCI}_{\mathcal{F}_{\mathit{R\_DOD\ Seq.}}}$ |
|---|---|---|---|---|
| Ideal: FD, R, Comm | 0.9222 | 0.0953 | 0 | 0 |
| Ideal: Comm, FD | 0.9221 | 0.1016 | 0 | 0.0522 |
| Ideal: FD, R | 0.9236 | 0.0931 | 0 | 0 |
| Ideal: FD | 0.9237 | 0.0994 | 0 | 0.0542 |
| Ideal: Comm, R | 0.9278 | 0.2123 | 0.1461 | 0 |
| Ideal: Comm | 0.9279 | 0.2119 | 0.1456 | 0.0798 |
| Ideal: R | 0.9278 | 0.2121 | 0.146 | 0 |
| Reference | 0.9291 | 0.2085 | 0.1456 | 0.0851 |

© 2016 by the authors; licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons by Attribution (CC-BY) license (http://creativecommons.org/licenses/by/4.0/).

## Share and Cite

**MDPI and ACS Style**

Aizpurua, J.I.; Muxika, E.; Papadopoulos, Y.; Chiacchio, F.; Manno, G.
Application of the D3H2 Methodology for the Cost-Effective Design of Dependable Systems. *Safety* **2016**, *2*, 9.
https://doi.org/10.3390/safety2020009
