A Combinatorial Safety Analysis of Cruise Ship Diesel–Electric Propulsion Plant Blackout

Diesel–Electric Propulsion (DEP) has been widely used for the propulsion of various ship types, including cruise ships. Considering the potential consequences of blackouts, especially on cruise ships, it is essential to design and operate the ships’ power plants so that such events are prevented. This study aims to implement a comprehensive safety analysis of a cruise ship DEP plant, focusing on blackout events. The Combinatorial Approach to Safety Analysis (CASA) method is used to develop Fault Trees with the blackout as the top event, and subsequently to estimate the blackout frequency and carry out an importance analysis. The derived results demonstrate that the overall blackout frequency is close to corresponding values reported in the pertinent literature as well as to estimations based on available accident investigations. This study deduces that the blackout frequency depends on the number of operating Diesel Generator (DG) sets, the DG sets’ loading profile, the amount of electrical load that can be tripped during overload conditions and the plant operation phase. In addition, failures of the engine auxiliary systems and the fast electrical load reduction functions, as well as the power generation control components, are identified as important. This study demonstrates the applicability of the CASA method to complex marine systems and reveals the parameters influencing the blackout frequency of the investigated system, thus providing better insights for the safety analysis and enhancement of such systems.


Introduction
The ship propulsion and electric power generating functions of modern cruise ships are realised using Diesel-Electric Propulsion (DEP) plants [1][2][3]. In such cases, a loss of electric power (blackout) while the ship is sailing or manoeuvring may result in accidents such as collision, contact and grounding, which, in turn, may cause considerable loss of life among passengers and crew [4], as well as severe environmental and reputational consequences. As the cruise ship industry has developed rapidly in the last decade, with both the vessels' size and number constantly growing [5], preventing blackouts is of paramount importance.
The ships' DEP plants are classified as complex marine Cyber-Physical Systems (CPSs) [29] and, thus, their software-intensive character and dynamic reconfiguration functions need to be considered in the safety analysis/assessment [30]. According to previous accident investigations, the control and automation system faults are important contributors to blackouts in ships [31,32]. Thus, it is essential to quantitatively assess the DEP system's safety performance taking into account the employed software-based functions [33][34][35], as well as to estimate their importance metrics to allow for a cost-efficient safety enhancement [36,37].
In this respect, the present study aims to: (a) estimate the blackout frequency for the investigated cruise ship DEP system in various operational phases; (b) carry out an importance analysis to identify the critical components, and; (c) demonstrate the applicability of CASA to a complex system. The deficiencies of the classical safety analysis methods are addressed by the CASA method, which: (a) identifies Unsafe Control Actions, as it encapsulates the STPA steps (thus more effectively capturing the software-intensive character of the Cyber-Physical System (CPS)); (b) considers the sequences of the potential safety events by employing event sequence analysis, and; (c) quantifies the frequency (or probability) of the safety-related events by employing quantitative FTA.
The original contribution and novelty of this study include: (a) the quantitative estimation of the blackout frequency for a cruise ship DEP plant and the associated importance analyses in a number of operation phases; (b) the blackout frequency estimation with varying design and operational parameters, such as the Maximum Continuous Rating and the amount of tripped load; (c) a number of adaptations made to the CASA method to apply it to the investigated DEP system.

System Description
The simplified single line diagram and the system control structure diagram are provided in Figures 1 and 2, respectively. Design data were retrieved from the operating and maintenance manuals of the system components, the associated system drawings and relevant literature [2][38][39][40][41][42][43] and are provided in Table 1. The start-up of the engines (of the DG sets) is based on the ship's electrical load demand, whereas the engine switchover is also implemented based on the DG sets' running hours. The system is capable of implementing fast electrical load reduction of the propulsion motors and preferential tripping functions (fast load reduction). The latter is realised by tripping heavy energy consumers, including the electric motors of the Air Conditioning (AC) system compressors. The generator sets can normally accept a load of up to 90% of their nominal power. The total power demand is evenly shared among the operating generator sets (proportionally to each generator set's nominal power). Prewarning alarms can allow a DG set to switch over to a healthy available DG set when a lubrication oil low-pressure alarm, high exhaust gas temperature alarm and high cooling water temperature alarm are present in each operating DG set. As an optional function, the intelligent DG set diagnosis can be used. It allows for tripping a faulty DG set when a failure leading to load imbalance and subsequent blackout is present in the governor and Automatic Voltage Regulator (AVR) subsystems [41].

Case Studies Selection
Based on the system description, the following case studies are selected:
• Varying the prewarning alarms' effectiveness from 0% to 50% and 100% to assess the importance of this function.
• Varying the DG set loading, as it is also expected to affect the potential DG overload conditions [39].
• Investigation of the maintenance intervals and periodicity impact on the blackout, as it is widely acknowledged that maintenance intervals and periodicity affect the system's safety.
• Investigation of the intelligent diagnosis impact on the blackout frequency/failure rate (intelligent diagnosis is a novel concept [41], allowing the identification of the DG set that contributes to the network instability and its switching off).
• Investigation of the system's susceptibility to blackout with varying the number of operating DG sets.
• Investigation of the DEP system's susceptibility to blackout and important failures in various operating phases of the cruise ship including sailing in the open sea, manoeuvring close to harbours and in the harbour phase (at berth).
To facilitate the comparative assessment of the investigated case studies and the verification of the calculated frequency of blackout (FOB) against results from the pertinent literature, a general operation phase was considered that aggregates the analysed operation phases. The considered case studies for the investigated system are summarised in Table 2.

Employed Methodological Approach
The methodology followed in this study consists of three phases, as illustrated in the flowchart shown in Figure 3. The CASA method was adapted to the needs of the case studies as described below. During the first phase, the investigated system's Fault Trees are developed considering the blackout as the top event. During the second phase, the top event failure rate is estimated for the considered case studies. During the third phase, the importance measures are calculated.

Phase 1 Fault Tree Development
During the first phase, the Fault Tree with the blackout as the top event is developed by employing the CASA method. A detailed description of the CASA method and its steps is provided in [33]; therefore, only a short description is given here. Initially, CASA follows the steps of the STPA approach (Leveson, 2011): the hazards or sub hazards and the Unsafe Control Actions (UCAs), together with their causal factors, are identified based on the hierarchical control structure. Then, each sub hazard/hazard is used as an initiating event, and the propagation of sub hazards into other hazards or sub hazards is examined using Event Sequence Identification (ESI), considering the interactions between the system components, the presence of protective barriers and the combinatory faults. The previous steps' results are synthesised into a single Fault Tree (FT), which effectively integrates the results of the STPA and ESI analyses. In the last step of Fault Tree development, some events of the Fault Tree are further analysed using FTA. This is implemented for the reference system as well as for the system with intelligent diagnosis, considering the total number of connected DG sets. Hence, 12 Fault Trees are developed in total. Six Fault Trees are developed for the reference system, each corresponding to the case where one to six DG sets operate (simultaneously in the case of multiple DG sets). Likewise, six FTs are developed for the investigated DEP system with intelligent diagnosis functionality. Although the developed Fault Trees are similar, some of the connections and gates differ, reflecting each investigated system's characteristics.
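To make the quantitative side of the Fault Tree concrete, the following is a minimal sketch of how a tree of AND/OR gates can be evaluated bottom-up. It is not the tool or the tree used in the study: the class names `Basic` and `Gate`, the function `probability`, the toy tree and all probabilities are hypothetical, and the evaluation assumes statistically independent events that each appear only once in the tree.

```python
from dataclasses import dataclass, field
from math import prod
from typing import List, Union


@dataclass
class Basic:
    """Basic (or undeveloped) event with a fixed probability."""
    name: str
    p: float


@dataclass
class Gate:
    """AND/OR gate over child nodes, assumed statistically independent."""
    kind: str  # "AND" or "OR"
    children: List[Union["Gate", Basic]] = field(default_factory=list)


def probability(node: Union[Gate, Basic]) -> float:
    """Evaluate a node bottom-up under the independence assumption."""
    if isinstance(node, Basic):
        return node.p
    child_p = [probability(c) for c in node.children]
    if node.kind == "AND":
        return prod(child_p)
    if node.kind == "OR":
        return 1.0 - prod(1.0 - p for p in child_p)
    raise ValueError(f"unknown gate kind: {node.kind}")


# Hypothetical toy tree (illustrative numbers only, not study data):
# blackout if an overload occurs AND both mitigations fail,
# OR if a short circuit is not cleared.
overload_path = Gate("AND", [
    Basic("DG set overload", 1e-2),
    Basic("fast load reduction fails", 1e-1),
    Basic("standby DG set fails to start", 5e-2),
])
blackout = Gate("OR", [overload_path, Basic("uncleared short circuit", 1e-5)])
print(f"P(blackout) = {probability(blackout):.3e}")
```

When the same basic event appears under several gates, as with common causal factors, this simple recursion is no longer exact and a minimal cut set evaluation is normally used instead.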


Phase 2 Top Event Frequency Estimation
During the second phase (Figure 3), the top event failure rate is estimated and the results are compared with respective results from generic accident data. For the investigated system, two adaptations were made. The first adaptation is the estimation of the failure rate in different operation phases according to the following equation:

$$\lambda_B = \sum_{p=1}^{6} OP_p \, \lambda_{p,B}$$

where $\lambda_B$ denotes the system blackout failure rate in a specific phase, $p$ denotes the total number of operating DGs, $OP_p$ denotes the frequency of operation with a specific total number of DGs (from one to six DGs simultaneously operating in the investigated DEP system), and $\lambda_{p,B}$ is the blackout failure rate with the specific total number of operating DGs. The assumption employed for deriving this equation is that the probabilities of the blackout are independent for each considered system configuration and operation phase.
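The symbol definitions above suggest that the phase-level blackout rate combines the configuration-specific rates, weighted by the fraction of time spent in each configuration. A minimal sketch under that assumption is given below; the function name `phase_blackout_rate` and all numerical inputs are hypothetical and are not values from the study.

```python
def phase_blackout_rate(op_fraction, rate_by_config):
    """Weighted sum of configuration-specific blackout rates.

    op_fraction[p]    : fraction of time with p DG sets operating (sums to 1)
    rate_by_config[p] : blackout failure rate with p DG sets operating (per hour)
    """
    if set(op_fraction) != set(rate_by_config):
        raise ValueError("both mappings must cover the same DG set counts")
    total = sum(op_fraction.values())
    if abs(total - 1.0) > 1e-9:
        raise ValueError(f"operating fractions must sum to 1, got {total}")
    return sum(op_fraction[p] * rate_by_config[p] for p in op_fraction)


# Hypothetical inputs for one operation phase (NOT values from the study).
op = {1: 0.05, 2: 0.30, 3: 0.40, 4: 0.20, 5: 0.04, 6: 0.01}
lam = {1: 4e-4, 2: 1e-4, 3: 6e-5, 4: 5e-5, 5: 5e-5, 6: 6e-5}  # per hour
lambda_B = phase_blackout_rate(op, lam)
print(f"phase blackout rate = {lambda_B:.3e} per hour")
```

The explicit check that the operating fractions sum to one guards against mixing fractions from different operation phases.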
The blackout frequency $f_B$ is calculated as a function of the specific operational time (OT) and the blackout failure rate ($\lambda_B$) [45]. For each operation phase, the calculated $f_B$ is compared with the $f_B$ available from accident investigation data. This is required to ensure the consistency of the derived results with existing statistical data for a number of power plants.
The second adaptation accounts for the operating components with preventive maintenance. The average failure rate between two inspection or maintenance events was estimated by considering Weibull distributions for the components' failure rates. The Weibull distribution was required so that the effect of the inspection and maintenance intervals can be properly captured. The additional equations that were used for estimating the considered basic events probability (p x y,z ) are provided in Table 3. The required input parameters include the number of redundant components $r$, the components' maintenance and testing intervals ($T_i$), the maintenance repair rates ($\mu_i$), the components' failure rates ($\lambda_i$), and the beta factor of the Weibull distribution ($\beta_i$). The probability of DG set overload conditions in cases of a single or multiple DG sets failure was estimated using the equations derived by [39].

Table 3. Additional equations.
Operating components:
• Parts with preventive maintenance where a single component failure out of r identical will lead to event occurrence (based on [45])
• Parts with preventive maintenance where all the r identical components must fail for event occurrence (based on [45])
• Other components with preventive maintenance
Safety systems:
• Unavailability due to periodical maintenance of standby equipment where r standby equipment are involved (based on [45])
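As an illustration of why the Weibull distribution makes the maintenance interval matter, the sketch below computes the average hazard rate over one interval for a single component. It is not one of the Table 3 equations: it assumes a two-parameter Weibull life with scale `eta_h` and shape `beta`, and a component restored to as-good-as-new at each maintenance; the function name and all numbers are hypothetical.

```python
def weibull_average_rate(eta_h: float, beta: float, interval_h: float) -> float:
    """Average hazard rate over [0, interval_h] for a Weibull(eta, beta) life.

    The cumulative hazard of a Weibull distribution is (t / eta) ** beta,
    so the interval average is that quantity at t = T divided by T.
    """
    if eta_h <= 0 or beta <= 0 or interval_h <= 0:
        raise ValueError("eta_h, beta and interval_h must be positive")
    return (interval_h / eta_h) ** beta / interval_h


# Hypothetical wear-out component (beta > 1): a longer interval between
# maintenance events raises the average failure rate.
for interval in (2000.0, 4000.0, 8000.0):
    rate = weibull_average_rate(eta_h=20000.0, beta=2.0, interval_h=interval)
    print(f"T = {interval:6.0f} h -> average rate = {rate:.2e} per hour")
```

With `beta = 1` the expression reduces to the constant rate `1 / eta_h` of the exponential distribution, which is why a maintenance interval effect only appears for shape factors different from one.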

Phase 3 Importance Measures Estimation
During the third phase, the importance measures are calculated. The only adaptation is that, instead of the top event failure rate ($\lambda_{p,B}$), the blackout failure rates for each operation phase ($\lambda_B$) are used. The importance measure results ($I_{j}^{FV}$) are compared with available statistical data. As the $I_{j}^{FV}$ metric is used to identify the most probable cause of the top event, it can be compared with available data from accident investigation reports [31] by aggregating the $I_{j}^{FV}$ values for the different failure categories leading to a blackout, thus quantifying the overall contribution of each category ($I_{j,OM}^{FV}$). The safety recommendations are primarily generated based on the importance measures. However, results from other phases, the structures of the generated Fault Trees and related observations, as well as the estimated blackout frequency in the investigated operating phases, are also used to derive appropriate safety recommendations.
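The importance calculation can be illustrated with a small sketch. It assumes that the measure used here is the Fussell-Vesely importance, evaluated from minimal cut sets with the rare-event approximation; the cut sets, event names, categories and probabilities are hypothetical and do not come from the study.

```python
from math import prod


def cut_set_probability(cut_set, p):
    """Probability of one minimal cut set, assuming independent events."""
    return prod(p[e] for e in cut_set)


def fussell_vesely(cut_sets, p):
    """Share of the top event probability carried by cut sets containing each event
    (rare-event approximation: top probability = sum of cut set probabilities)."""
    top = sum(cut_set_probability(cs, p) for cs in cut_sets)
    return {
        e: sum(cut_set_probability(cs, p) for cs in cut_sets if e in cs) / top
        for e in p
    }


# Hypothetical minimal cut sets and basic event probabilities.
cut_sets = [
    {"pms_hw"},
    {"dg_overload", "load_reduction_fails"},
    {"governor", "avr"},
]
p = {"pms_hw": 1e-5, "dg_overload": 1e-2, "load_reduction_fails": 1e-3,
     "governor": 1e-3, "avr": 1e-3}
fv = fussell_vesely(cut_sets, p)

# Aggregation per (hypothetical) failure category by simple summation.
# Events sharing a cut set are counted once per event, so category totals
# are indicative shares and can add up to more than 1.
category = {"pms_hw": "control", "load_reduction_fails": "control",
            "dg_overload": "engine", "governor": "engine", "avr": "electrical"}
by_category = {}
for event, value in fv.items():
    by_category[category[event]] = by_category.get(category[event], 0.0) + value
print(fv)
print(by_category)
```

The aggregated values are the kind of quantity that can be set against category shares reported in accident statistics, bearing in mind the double counting noted in the comments.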

Overview
Six types of input parameters are used, namely: (a) design data including the system layout, the system functions, the number and type of involved components, the control structure, and the maximum loads for some of the components (presented in Section 2.1); (b) the operating data for the system and its components; (c) the maintenance and inspection intervals for some of the components; (d) the maintenance duration for some of the components; (e) the components' failure rates and $\beta_i$ factors; (f) assumptions for system functionalities. The input parameters used, along with the associated sources, are further analysed in the next paragraphs.

Operating Data
Based on the investigated cruise ship's actual operating data, which were collected over a period of 46 months, the frequency (time percentages) of each operation phase and of the specific system configuration (the latter also considers the operating Propulsion Motors (PM) and Bow Thrusters (BT)) were estimated and are presented in Table 4. These data were aggregated by an automatic monitoring system, which provides the electric energy (in kWh) of the DG sets, the azipods and the bow thrusters every 30 min over the above-mentioned period. The general phase shown in Table 4 represents the overall, averaged plant operation and is practically a combination of the other operation phases. Based on the available data, the probability density functions for the DG sets' load were estimated. From the operational data, the following observations were made: (a) a request to connect an additional DG set to the ship's electric network is implemented every 10 h; (b) switching over between DG sets is implemented every 20 h; (c) the change from the harbour phase to the manoeuvring phase, and vice versa, is implemented every 40 h.
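The derivation of configuration time fractions from the logged data can be sketched as follows. The approach is an assumption made for illustration only: a DG set is counted as operating in a 30 min sample when its logged energy exceeds a hypothetical threshold, and the function name, threshold and sample values are not taken from the study.

```python
from collections import Counter


def operating_fractions(samples_kwh, threshold_kwh=1.0):
    """Fraction of samples with a given number of DG sets operating.

    samples_kwh   : one tuple per 30 min sample with the energy of each DG set (kWh)
    threshold_kwh : hypothetical energy above which a DG set counts as operating
    """
    if not samples_kwh:
        raise ValueError("at least one sample is required")
    counts = Counter(
        sum(energy > threshold_kwh for energy in sample) for sample in samples_kwh
    )
    n = len(samples_kwh)
    return {k: c / n for k, c in sorted(counts.items())}


# Four hypothetical 30 min samples for a plant with six DG sets.
samples = [
    (120.0, 110.0, 0.0, 0.0, 0.0, 0.0),
    (120.0, 110.0, 90.0, 0.0, 0.0, 0.0),
    (120.0, 110.0, 90.0, 0.0, 0.0, 0.0),
    (100.0, 0.0, 0.0, 0.0, 0.0, 0.0),
]
print(operating_fractions(samples))
```

Because the samples are equally spaced in time, the sample fractions can be read directly as time fractions.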

Maintenance Inspection Intervals and Maintenance Duration
The maintenance inspection intervals were retrieved from the manufacturers' maintenance manuals, whilst the maintenance duration was estimated based on the data provided in [8,46,47], the OREDA database [48] and the actual operational data. For the safety functions sensors, it was assumed that their maintenance duration is equal to one hour, whilst the hardware and communication lines' maintenance duration was assumed to be 20 h.

Failure Rates for Components
Several sources were used to estimate the failure rates for the components of the investigated cruise ship DEP system. These included: (a) the OREDA database [48][49][50]; (b) the pertinent literature (as reported in [25] and the Supplementary Material of this study), and; (c) previous blackout events' investigation reports (made available by a cruise ship operator). The accident investigation reports and the Protection and Indemnity (P&I) insurance club results [31] were used for a high-level comparison of the criticality assessment results with the results calculated for the investigated system. The failure rates of the system's functions that use software were estimated from the data provided in [51,52]. The $\beta_i$ values for components with preventive maintenance were retrieved from a number of publications listed in [25] and the Supplementary Material. To convert the components' failure rates (initially estimated using the exponential distribution) for use with the Weibull probability distribution, the correction ratio values provided in [53] were used.
In addition, the failure rates were assumed to be zero for all the STPA causal factors related to the flawed process model, except for the failure rates depicting errors related to the intelligent DG set diagnosis responsible for the identification of system load imbalances.

Analysis Assumptions
The following conservative assumptions were made for analysis purposes:
• Any electrical load sharing imbalance can be corrected by the Power Management System (PMS) in 90% of the cases, whereas if an intelligent generator diagnosis is provided in the system, this system manages all the electrical load sharing imbalances by tripping the faulty DG set.
• An uncontrolled electrical load sharing imbalance will lead to a blackout in half of the cases, whilst only one DG set will be lost for the other half.
• Prewarning functions will allow the safe switch over to another DG set in 50% of the cases when a lubrication oil low-pressure alarm, high exhaust gas temperature alarm and high cooling water temperature alarm are present in one of the operating DG sets.
• The power plant operates with the bus-tie circuit breaker connected in all operational modes.
• It should be noted that the system operation with six DG sets is very rare for the reference system (less than 1% of the total ship operational time), so it was set at 1% to assess the influence of the system configuration with six DG sets operating on the overall blackout frequency.
• Any short circuit not cleared by the protection system will lead to the DG sets' overcurrent and a consequent blackout.
• The tripping of air conditioning motors, bow thrusters and other loads causes insignificant electrical transients. Significant electrical transients are caused by the loss of operating propulsion motors and DG sets.
• An uncontrolled arc failure in the switchboard will cause a loss of one electric power section of the DEP plant. This is a realistic assumption, as an uncontrolled arc may result in switchboard destruction.
• Any fire in an engine room will lead to the loss of all the generator sets in this engine room.

The list of the sub hazards generated from the STPA for the investigated DEP system that can lead to a blackout event, along with the safety constraints and the existing safety measures, is presented in Table 5. These hazards were identified based on previous publications such as [39,41,43][54][55][56][57]. The identified sub hazards are not related to the system component failures and transfer the focus of analysis to the general system state. This is an advantage of this study compared to the previous studies [9,15,18,20] that consider only the DG sets' availability. Herein, conditions such as imbalanced power generation, operating DG set overload and electrical transients are considered. The presented sub hazards are of the high-level type, and they most likely could be identified using a Preliminary Hazard Analysis (PHA) method. However, the PHA cannot support the identification of the Unsafe Control Actions (UCAs) and their related causal factors.
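The first two load sharing assumptions listed in this section can be combined into conditional outcome probabilities for a single load sharing imbalance. The short sketch below works through that arithmetic; the function name is hypothetical, and failures of the intelligent diagnosis function itself are deliberately ignored here, which is a simplification of the analysis.

```python
P_PMS_CORRECTS = 0.90               # stated assumption: PMS corrects 90% of imbalances
P_BLACKOUT_IF_UNCONTROLLED = 0.50   # stated assumption: half of uncontrolled cases black out


def imbalance_outcomes(intelligent_diagnosis: bool) -> dict:
    """Conditional outcome probabilities given that a load sharing imbalance occurs.

    Simplification: with intelligent diagnosis, every imbalance is assumed to be
    handled by tripping the faulty DG set (diagnosis failures are ignored).
    """
    p_uncontrolled = 0.0 if intelligent_diagnosis else 1.0 - P_PMS_CORRECTS
    p_blackout = p_uncontrolled * P_BLACKOUT_IF_UNCONTROLLED
    return {
        "handled": 1.0 - p_uncontrolled,
        "blackout": p_blackout,
        "loss of one DG set": p_uncontrolled - p_blackout,
    }


reference = imbalance_outcomes(intelligent_diagnosis=False)
with_diagnosis = imbalance_outcomes(intelligent_diagnosis=True)
print(reference)
print(with_diagnosis)
```

Under these assumptions, roughly one in twenty load sharing imbalances ends in a blackout for the reference system, which is the path that intelligent diagnosis is intended to remove.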
Table 5 (excerpt). Sub hazards, safety constraints and existing safety measures.
Safety constraint: The system must be resilient to the presence of transients in the network and prevent their existence in the system. Existing safety measures: proper selection of tripping function settings, design parameters of DG sets, control over propulsion motors during the start.
H-5, electrical disturbances such as short circuits. Safety constraint: The system must prevent the occurrence of short circuits and not allow the short circuit and arc fault to be uncontrolled. Existing safety measures: protection relays, arc detection systems.

The investigated DEP system control structure (CASA Step 2) was developed based on the information in the manufacturers' manuals and other publications cited in Section 3.1. The developed overall control structure is presented in Figure 4a, whereas the typical detailed description of the engine governor is provided in Figure 4b. The intelligent generator diagnosis system is also included in Figure 4a.
The UCAs of the investigated system (CASA Step 3) were derived with the support of the open-source software XSTAMPP [58], by considering all the possible failure modes of the control actions; in total, 78 UCAs were identified. A considerable number of UCAs (19/78 or 24%) were related to the PMS functions, whereas six of them were related to intelligent diagnosis. Proceeding from the higher to the lower controller hierarchical levels, the number of UCAs decreases, as the controllers' functionalities reduce in number. The largest share of the UCAs (56%) was related to the DG set overload hazard H-3. The incorporation of the UCAs leading to blackout for the investigated DEP system is one of the elements that differentiate the Fault Tree developed in the next steps from the Fault Trees presented in [9,15,18,20]. In this respect, the presented analysis more effectively captures the software-intensive character of the investigated DEP system.
The second step of the STPA (CASA Step 4) includes the identification of the causal factors contributing to the DEP system UCAs. For each UCA, 1 to 10 causal factors were identified. This task was repeated for all the 78 UCAs. On average, 3.8 causal factors per UCA were identified (299 in total, considering intelligent diagnosis). The distribution of all causal factors per category is shown in Figure 5. It is observed that the dominant factors were related to: (a) the flawed control algorithm implementation; (b) the inconsistent process models; (c) the flawed process model input from sensors to controllers, and; (d) the inappropriate transmission of the control signal to actuators. In addition, failures in actuators leading to the flawed execution of control actions were identified as important causal factors. Fewer causal factors were identified related to conflicting control actions, missing output from controllers due to their failure and inappropriate control input. These results are attributed to the fact that the STPA more effectively highlights the importance of the software functions for the system, thus supporting the identification of the causal factors related to the control hardware and software including flawed control algorithms, flawed process models and flawed process model input parameters [30].
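The UCA identification step (CASA Step 3) systematically pairs each control action with the ways in which it can be unsafe. A minimal sketch of that enumeration is shown below; it is not the XSTAMPP workflow used in the study, the controllers and control actions are hypothetical examples, and the four guidewords are the generic STPA categories. Each generated candidate still has to be judged by an analyst against the hazards and the operating context before it counts as a UCA.

```python
from itertools import product

# Generic STPA guidewords for how a control action can be unsafe.
GUIDEWORDS = (
    "not provided",
    "provided when unsafe",
    "provided too early or too late",
    "stopped too soon or applied too long",
)

# Hypothetical controllers and control actions (illustrative only).
control_actions = {
    "PMS": ["start standby DG set", "trip preferential loads"],
    "Governor": ["adjust fuel rack position"],
}


def candidate_ucas(actions):
    """Enumerate (controller, control action, guideword) candidates for review."""
    return [
        (controller, action, guideword)
        for controller, acts in actions.items()
        for action, guideword in product(acts, GUIDEWORDS)
    ]


candidates = candidate_ucas(control_actions)
for controller, action, guideword in candidates:
    print(f"{controller}: '{action}' {guideword}")
```

The number of candidates grows with the number of control actions, which is consistent with higher-level controllers such as the PMS accounting for a large share of the identified UCAs.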

ESI Results
The application of ESI is crucial for capturing the DEP dynamic reconfiguration functions [33]. The five sub hazards identified for the investigated system, H-1 to H-5 (Table 5), were used as initiating events in the ESI "Event Trees" development phase.

A resultant example ESI "Event Tree" showing the propagation of two of the sub hazards, namely DG set unavailability H-1 and operating DG set overload H-3, leading to blackout is presented in Figure 6. A blackout is expected to occur if the DG sets' overloading is not properly handled by the system (that is, if the DG set overload is not reduced by tripping the AC motors or by reducing the propulsion motor electrical load). The unavailability of DG sets will indirectly lead to the DG set overload.
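The propagation described above can be quantified along one branch of such an "Event Tree" by multiplying the initiating event rate by the conditional probabilities of the subsequent events. The sketch below illustrates this for the path from DG set unavailability to blackout; the function name, the branch structure and all numbers are hypothetical, and the branch probabilities are assumed to be conditional on the preceding events.

```python
def branch_rate(initiating_rate_per_h, conditional_probabilities):
    """Rate of one event tree branch: initiating rate times each conditional probability."""
    rate = initiating_rate_per_h
    for name, probability in conditional_probabilities.items():
        if not 0.0 <= probability <= 1.0:
            raise ValueError(f"probability out of range for '{name}': {probability}")
        rate *= probability
    return rate


# Hypothetical branch: loss of a DG set -> remaining sets overloaded ->
# fast load reduction and preferential tripping both fail -> blackout.
blackout_rate = branch_rate(
    initiating_rate_per_h=1e-3,
    conditional_probabilities={
        "remaining DG sets become overloaded": 0.2,
        "fast load reduction fails": 0.01,
    },
)
print(f"branch blackout rate = {blackout_rate:.1e} per hour")
```

Summing the rates of all branches that end in a blackout gives the contribution of one initiating event, which is the information later synthesised into the Fault Tree.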

Safety 2021, 7, x FOR PEER REVIEW 11 of 24

STPA and ESI Results Integration
The Fault Tree derived from the synthesis of the ESI results is presented in Figure 7 (CASA Step 6). The developed Fault Tree is quite extensive and includes 13 levels, 21 AND gates, 9 OR gates and 57 undeveloped events; hence, it was not possible to present it to its full extent. The Fault Tree shown in Figure 7 demonstrates the complexity of the interactions between the different sub hazards in the investigated system. The operating DG set overload leading to a blackout event is also represented in this FT to show the relationship between Fault Trees shown in Figures 7 and 8.

Following the STPA results' integration into the developed Fault Tree (CASA Step 7), its size became extremely large, as for each event of the initial "Event Trees" and consequently to the Fault Tree two levels were added, exponentially increasing the number of gates and undeveloped events corresponding to the UCAs and the causal factors, respectively. Refinement for the UCAs context was applied for 40 out of 78 UCAs in the reference system (CASA Step 8). Typical examples include the UCAs for starting the DG sets and controlling the position of the bus-tie breaker. The grouping of the interconnected UCAs was applied for the UCAs related to the DG sets starting, controlling the propeller speed, and thus, the load of the electric propulsion motors as well as the UCAs for controlling the bus-tie circuit breaker position. The electrical load transients may be caused by different events (fast increase in propulsion power or sudden loss of a heavy electrical consumer), which will increase or decrease the operating DG sets' power output leading to potential imbalanced load sharing between the connected DG sets. The causal factors for the occurrence of the UCAs leading to imbalanced load sharing between the DG sets in both cases are the same, so their merging can be applied. The PMS hardware failure and the DG sets' speed and voltage sensors' erroneous measurements were identified as common causal factors to many UCAs and were promoted to a higher level. Contradictions were found in the UCAs related to the PMS functions. The PMS cannot start a DG set and cannot handle a load imbalance or overload when the PMS hardware failure occurs. An additional refinement was applied to the UCAs related to the DG sets' physical failures. An extract from the refined Fault Tree describing the conditions leading to blackout due to operating DG set overload based on the "Event Tree" of Figure  6 is presented in Figure 8. 
As can be observed from this figure, the refinement was applied in the cases of (a) not starting a DG set when a DG set has a failure; (b) not starting a DG set when the load demand is high; and (c) the PMS hardware failure. The DG sets and other failures are further analysed using the FTA as described in the next section.
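The effect of promoting a common causal factor to a higher level can be illustrated with a minimal sketch. It assumes a hypothetical two-branch tree in which the PMS hardware failure appears under both branches of an AND gate; the event names and probabilities are illustrative only and are not taken from the study. Evaluating the gates one by one treats the repeated event as two independent events and misestimates the top event, whereas the restructured tree, with the common factor promoted, matches the exact result.

```python
from itertools import product

def exact_probability(structure, probs):
    """Exact top-event probability by enumerating every basic-event state."""
    names = sorted(probs)
    total = 0.0
    for states in product([False, True], repeat=len(names)):
        assignment = dict(zip(names, states))
        if structure(assignment):
            weight = 1.0
            for name in names:
                weight *= probs[name] if assignment[name] else 1.0 - probs[name]
            total += weight
    return total

def or_gate(*p):
    """Probability of an OR gate over independent inputs."""
    q = 1.0
    for x in p:
        q *= 1.0 - x
    return 1.0 - q

# Hypothetical probabilities (illustrative values, not from the study).
probs = {"pms_hw": 0.01, "uca_start": 0.05, "uca_load": 0.02}

def original_tree(s):
    # The common factor pms_hw is repeated under both branches of the AND gate.
    return (s["uca_start"] or s["pms_hw"]) and (s["uca_load"] or s["pms_hw"])

def promoted_tree(s):
    # Same logic with the common factor promoted above the AND gate.
    return s["pms_hw"] or (s["uca_start"] and s["uca_load"])

p_original = exact_probability(original_tree, probs)
p_promoted = exact_probability(promoted_tree, probs)

# Naive gate-by-gate evaluation wrongly treats the repeated event as independent.
p_naive = or_gate(probs["uca_start"], probs["pms_hw"]) * or_gate(probs["uca_load"], probs["pms_hw"])

print(p_original, p_promoted, p_naive)
```

In this sketch the naive evaluation underestimates the top event by roughly a factor of six, which is one reason why repeated causal factors are worth restructuring before quantification.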

FTA results
The FTA is used to further develop some events in the refined Fault Tree of the previous step; specifically, FTA was applied for the analysis of the failures in one DG set, its auxiliary systems and the ship's propulsion electric motors. The Fault Tree derived for the main engine failures leading to the engine shutdown is presented in Figure 9. This Fault Tree was developed based on information provided in [59][60][61][62][63][64][65]. However, it differs from these sources in the way the failures are organised and presented, as attention was given to the conditions leading to the engine shutdown. In this Fault Tree, the failures of the air starting system are not incorporated, as the air supply system is engaged only during the engine starting procedure. In addition, failures leading to the deterioration of the system performance are not considered as a cause of the engine shutdown. The critical alarms leading to the system shutdown are activated by: (a) failures of the DG set control hardware; (b) high cylinder liner temperature; (c) high cooling water temperature; (d) high thrust bearing temperature; (e) high main bearing temperature; (f) low lubrication oil pressure; (g) increased oil mist concentration; and (h) other failures affecting the engine output.
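As an illustration of how such an OR gate can be quantified, the following sketch assumes that the shutdown causes (a) to (h) are independent and have constant failure rates, in which case the rate of the OR gate is the sum of the input rates. All rate values are hypothetical placeholders and are not taken from the study or from [59][60][61][62][63][64][65].

```python
import math

# Hypothetical constant failure rates per operating hour for causes (a) to (h).
shutdown_causes_per_hour = {
    "dg_control_hardware_failure": 2.0e-5,
    "high_cylinder_liner_temperature": 5.0e-6,
    "high_cooling_water_temperature": 1.0e-5,
    "high_thrust_bearing_temperature": 3.0e-6,
    "high_main_bearing_temperature": 3.0e-6,
    "low_lubrication_oil_pressure": 1.2e-5,
    "increased_oil_mist_concentration": 2.0e-6,
    "other_failures_affecting_output": 1.5e-5,
}

def or_gate_rate(rates):
    """Rate of an OR gate over independent events with constant rates."""
    return sum(rates)

def probability_within(rate_per_hour, hours):
    """Probability of at least one occurrence within the given exposure time."""
    return 1.0 - math.exp(-rate_per_hour * hours)

shutdown_rate = or_gate_rate(shutdown_causes_per_hour.values())
p_24h = probability_within(shutdown_rate, 24.0)

print(f"shutdown rate: {shutdown_rate:.2e} per hour")
print(f"probability of a shutdown within 24 h: {p_24h:.2e}")
```

For small exposure times the probability is close to the product of rate and time, which is why rates and probabilities are often used interchangeably for rare events.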

Phase 2 Top Event Frequency/Failure Rate Estimation
The blackout failure rate (λ_B) for the cases where the investigated DEP system operates with a different number of DG sets simultaneously in the general phase is presented in Figure 10a,b. It can be deduced that λ_B is significantly higher when only one DG set operates, as a single point failure in the operating DG set or its auxiliary systems will lead to a blackout. In addition, due to the operational profile of the cruise ship and the DG sets' loading conditions, DG set overload conditions will occur more frequently when running with two or three DG sets (in comparison with the cases where more DG sets operate), which leads to greater λ_B values in these cases. Furthermore, operating with five DG sets yielded a slightly greater λ_B than operating with four DG sets. This is primarily owing to the DG set loading profile and secondarily to the fact that more components are used in the system, so it is more probable that a failure will occur.
Figure 10. Blackout failure rate (λ_B) for different total numbers of DG sets operating: (a) 1-6 DG sets operating in total; (b) 2-6 DG sets operating in total (note that the failure rate is much higher for 1 operating DG set).

It can also be inferred from Figure 10 that a substantial reduction in the λ_B value can be achieved for a specific system configuration when the prewarning functions (case ii) are fully operational, allowing a switch-over to a different engine in case of critical alarm activation. This implies that the implementation of advanced prognostic and diagnostic techniques will improve the investigated DEP system's safety for the case where one DG set operates, as it will allow reliable fault prediction and timely system reconfiguration. In addition, it can be deduced that λ_B is sensitive to the DG sets' operating profile, since a small increase in loading (3% of the Maximum Continuous Rating (MCR) point power) for each specific configuration leads to a considerable λ_B increase (case iii). The inspection and maintenance intervals (case iv) seem to affect λ_B only slightly, as the maintenance and inspection of some critical components are already frequent and the influence of maintenance intervals can be investigated only for a number of the system components. The addition of intelligent diagnosis (case v) for handling load sharing errors has a positive effect on λ_B when more than three DG sets operate. According to the derived operational profile, however, this is less frequent, applying to 25% of the operational time (Table 4). Finally, the preferential tripping function parameters have a direct impact on λ_B, similar to the DG sets' loading profile; the less load is tripped, the higher the λ_B (case vi).
The derived results for the top event failure rate estimation are presented in Table 6. The estimated frequency of blackout (FOB) in the general phase is higher than, but reasonably close to, the value of 0.1 events per ship-year reported in [66] and the 0.85 events per ship-year estimated from accident investigation reports, as shown in Table 6. However, in the harbour (ship at berth) phase, the FOB is significantly higher than the FOB in the general phase. This is because the system often operates with a single DG set connected to the ship's power network. In the manoeuvring phase, a number of DG sets operate at lower loads, which leads to a lower FOB value. This is attributed to the fact that more DG sets operate in the manoeuvring phase as a safety precaution. In the sailing phase, due to the increased number of operating DG sets, the FOB is found to be approximately 0.003 events per ship-year, which is much lower than the respective values for the other phases and the one reported by Friis-Hansen, Ravn and Engberg [66]. However, it must be noted that human error-induced blackouts, as well as blackouts caused by disconnection from the port network in the harbour phase, are not considered in the blackout frequency calculations. Furthermore, the estimate of 0.1 average events per ship-year refers to the fleet of cruise ships and passenger vessels and does not consider the specific differences between the different cruise ships' propulsion systems, which have an important influence on the FOB calculation as discussed above.
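The dependence of the blackout frequency on the operational profile can be sketched as a time-weighted average of the configuration-specific failure rates. The sketch below is not the calculation used in the study: the per-configuration λ_B values are hypothetical, and the time fractions are chosen only so that more than three DG sets operate for 25% of the time, as stated above. The conversion to events per year assumes a constant rate and continuous operation (8760 h per year).

```python
HOURS_PER_YEAR = 8760.0

# Hypothetical operational profile: number of running DG sets ->
# (fraction of operating time, blackout failure rate lambda_B per hour).
profile = {
    1: (0.10, 4.0e-4),
    2: (0.30, 6.0e-5),
    3: (0.35, 3.0e-5),
    4: (0.15, 5.0e-6),
    5: (0.10, 8.0e-6),
}

def time_weighted_rate(profile):
    """Average failure rate weighted by the time spent in each configuration."""
    total_fraction = sum(fraction for fraction, _ in profile.values())
    if abs(total_fraction - 1.0) > 1e-9:
        raise ValueError("time fractions must sum to 1")
    return sum(fraction * rate for fraction, rate in profile.values())

def events_per_year(rate_per_hour, hours=HOURS_PER_YEAR):
    """Expected number of events per year for a constant hourly rate."""
    return rate_per_hour * hours

lambda_b = time_weighted_rate(profile)
fob = events_per_year(lambda_b)

# Share of the overall rate contributed by single DG set operation.
share_one_dg = profile[1][0] * profile[1][1] / lambda_b

print(f"lambda_B = {lambda_b:.3e} per hour, FOB = {fob:.2f} events per year")
print(f"share from one running DG set: {share_one_dg:.0%}")
```

Even with a small time fraction, the single DG set configuration dominates the weighted rate in this example, which mirrors the qualitative observation made for the harbour phase.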

Phase 3 Importance Measures Estimation
The calculated I^FV_j values for the general (case study xii) and the sailing (case study xiv) operation phases are presented in Table 7. The I^FV_j is used to represent the most probable failure leading to a blackout; higher I^FV_j values denote a higher probability that these failures will lead to a blackout. The results for the harbour and the manoeuvring operation phases were similar to the results for the general operation phase. As can be inferred from Table 7, the mechanical failures leading to the loss of one DG set have greater importance in the general operation phase than in the sailing phase. These include the failures in the cooling water and the lubricating oil systems as well as the engine failures leading to an erroneous/missing output. The blackout failure rate is adversely affected by errors in the control systems including the PMS command leading to (a) a running DG set stopping, (b) fuel quick closing valve faulty operation, (c) faulty DG set tripping by the safety systems and (d) erroneous sensor measurements of the engine bearing temperature. Failures leading to a DG set tripping without prewarning, including failures in the control system hardware or shaft failures leading to a DG set stop, were also identified as important.
In the sailing phase, anomalies in the load sharing and control are of greater importance than in other phases. Such failures include erroneous DG set speed measurements, failures in fuel racks and failure in the propulsion motors' fast load reduction. Fuel leakages and control hardware failures were also identified as important contributors to the λ_B increase.
The derived results can be compared with available data from accident investigation reports and Protection and Indemnity (P&I) insurance club categories [31] by aggregating the I^FV_j values for the different failure categories leading to a blackout and analysing the overall contribution of each category (I^FV_j,OM). The comparison of the calculated parameters with other data sources is shown in Table 8. The derived results are, in general, in line with the results derived from accident investigation reports provided by a cruise ship operator as well as the results from a published P&I club study [31]. Differences in the estimated causal factor percentage in the various operating phases can be attributed to the fact that the importance of the mechanical failures changes from one operation phase to another, as the mechanical failures are of greater importance when fewer DG sets operate. According to the results of this analysis, the mechanical, electrical and control failures have a higher contribution to the λ_B value, whilst failures in the fuel system are found to contribute less to the λ_B value, in comparison to the respective contributions estimated according to the P&I results and the available accident investigation results. The observed deviations are justified by the fact that both the P&I club and accident investigation report results were derived from blackout analyses of a number of ships with different functionalities and design redundancy levels, which, as explained in Section 5.2, contributes to the system performance variation. In addition, these reports often do not capture the actual accident causes. In this respect, they can be used only for a high-level comparison with the calculated results of the present study.
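The aggregation of importance values by failure category can be sketched as follows, assuming that I^FV_j denotes the Fussell-Vesely importance (as the superscript suggests) and that it is evaluated from minimal cut sets with the rare-event approximation. The basic events, probabilities, categories and cut sets are hypothetical and serve only to show the mechanics; they are not the values behind Tables 7 and 8.

```python
from math import prod

# Hypothetical basic events: name -> (probability, failure category).
events = {
    "lube_oil_pump": (2e-3, "mechanical"),
    "cooling_water_pump": (1e-3, "mechanical"),
    "speed_sensor": (5e-4, "control"),
    "fuel_rack": (4e-4, "fuel"),
    "load_reduction": (1e-2, "control"),
}

# Hypothetical minimal cut sets leading to the top event (blackout).
cut_sets = [
    {"lube_oil_pump"},
    {"cooling_water_pump"},
    {"speed_sensor", "load_reduction"},
    {"fuel_rack", "load_reduction"},
]

def cut_set_probability(cut_set):
    """Probability that all (independent) events of a cut set occur."""
    return prod(events[name][0] for name in cut_set)

def top_probability():
    """Rare-event approximation: sum of the minimal cut set probabilities."""
    return sum(cut_set_probability(cs) for cs in cut_sets)

def fussell_vesely(event):
    """Share of the top-event probability from cut sets containing the event."""
    containing = sum(cut_set_probability(cs) for cs in cut_sets if event in cs)
    return containing / top_probability()

def importance_by_category():
    """Sum the event importances within each failure category."""
    totals = {}
    for name, (_, category) in events.items():
        totals[category] = totals.get(category, 0.0) + fussell_vesely(name)
    return totals

by_category = importance_by_category()
for category, value in sorted(by_category.items(), key=lambda kv: -kv[1]):
    print(f"{category}: {value:.4f}")
```

Because a cut set with several events is counted once for each of its events, the category totals obtained this way need not sum to exactly one, so they are best read as relative contributions.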
The ten failures with the greatest estimated I^B_j values for the general (case study xii) and the sailing (case study xiv) operation phases, indicating their "structural" importance, are given in Table 9. The results for the harbour and the manoeuvring operation phases were similar to the results of the general operation phase. As can be inferred from the general operation phase results, the blackout failure rate is sensitive to (a) failures in sensors used for the DG sets' tripping in case of a short circuit, and (b) failures in the thrust bearing sensors, owing to the multiple sensors employed. In the general phase, the blackout failure rate is also sensitive to failures leading to sudden tripping of DG sets without prewarning, such as failures in hardware used for DG set control, piston failures, and lubricating oil pressure and fresh water cooling system temperature sensor failures. In addition, λ_B was found to be sensitive to short circuits and differential current failures because (a) a three-phase alternating current (AC) electric system is used, and (b) the occurrence of a short circuit leads to a DG set tripping without prewarning. For the sailing operation phase, λ_B is sensitive to failures related to the system power reduction functions, such as failures in the DG set and the propulsion motor power sensors as well as failures in the sensors and the actuator used for the power control in the DG sets. High λ_B sensitivity was identified with respect to design errors, including overwhelming electrical transients in the system, and to DG set circuit breaker failures. The proper operation of the DG set circuit breaker is important to ensure that the DG set trips when a number of failures in the DG set occur, as a failure to trip will otherwise lead to prolonged DG set maintenance.
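The sensitivity interpretation of I^B_j can be illustrated with a small sketch, assuming that I^B_j denotes the Birnbaum importance, that is, the difference between the top-event probability with event j certain to occur and with event j certain not to occur. The tree structure, event names and probabilities below are hypothetical and much simpler than the Fault Trees of this study.

```python
def or_gate(*p):
    """Probability of an OR gate over independent inputs."""
    q = 1.0
    for x in p:
        q *= 1.0 - x
    return 1.0 - q

def and_gate(*p):
    """Probability of an AND gate over independent inputs."""
    r = 1.0
    for x in p:
        r *= x
    return r

# Hypothetical basic-event probabilities (illustrative values only).
probs = {
    "short_circuit_sensor": 1e-4,
    "control_hardware": 2e-4,
    "dg_overload": 5e-2,
    "load_reduction_fails": 1e-3,
}

def top_probability(p):
    """Hypothetical tree: a trip without prewarning OR an unmitigated overload."""
    trip = or_gate(p["short_circuit_sensor"], p["control_hardware"])
    unmitigated = and_gate(p["dg_overload"], p["load_reduction_fails"])
    return or_gate(trip, unmitigated)

def birnbaum(event):
    """Top-event probability with the event certain minus with it impossible."""
    failed = dict(probs, **{event: 1.0})
    working = dict(probs, **{event: 0.0})
    return top_probability(failed) - top_probability(working)

ranking = sorted(probs, key=birnbaum, reverse=True)
for name in ranking:
    print(f"{name}: {birnbaum(name):.5f}")
```

In this example the events that cause a blackout on their own receive values close to one regardless of how rare they are, whereas the events behind the AND gate are weighted by the probability of their partner event. This is consistent with single point failures without prewarning ranking highly in a structural measure.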

Safety Recommendations
Overall, the derived results indicate that the failures of the DG sets, failures without pre-warning alarms and the failures that can lead to the simultaneous loss of a number of DG sets are the most significant for the blackout failure rate. These findings indicate that the engine room redundancy required by Safe Return to Port regulations prevents a number of scenarios leading to blackout; however, it cannot address all the hazardous scenarios as explained below. Additionally, blackout prevention requires the reliable operation of the preferential tripping and the propulsion motors' load reduction functions. On the other hand, failures of the propulsion motors (except for those related to the power reduction functions) and failures in the electrical power network seem to be of less importance for the λ_B in the investigated DEP systems.
Based on the results of this analysis for the investigated system, the following safety recommendations can be provided with respect to design and operation, which can also be taken into consideration for other ship power plants:
• Ship operation with one DG set should be avoided, as it results in a considerably higher λ_B.
• The propulsion motors' fast electrical power reduction function, the power increase control function and the preferential tripping function should be thoroughly examined during the system design phase and extensively tested during the ship sea trials. These software-supported system functions must also be thoroughly tested following software updates.
• Adequate redundancy in speed and voltage sensors should be provided, or intelligent monitoring techniques should be employed, to avoid failures in the electrical power control system leading to a load imbalance and a blackout.
• The condition of the DG sets' fuel racks needs to be closely monitored by using advanced diagnosis and prognosis techniques.
• The tripping of DG sets due to sensor failures can be reduced by employing relevant fault tolerance techniques allowing the diagnosis and by-passing of relevant sensor failures.
• The tripping of DG sets due to failures in the control system hardware can be reduced by closer monitoring of the DG set components' health; for example, by monitoring the generator's electrical parameters (current, voltage, leakage currents, impedance changes) [67].
• The employed DG sets' size, loading profile and overload limits should be carefully selected to avoid overload conditions in case of one or more DG sets tripping.
• The prevention of failures leading to the simultaneous loss of a number of DG sets, such as a fuel quick closing valve faulty operation, a fire in the engine room and clogged sea chests, should be ensured.
• Meticulous design and testing of the components/subsystems with multiplicities, such as piston assemblies, must be ensured for DG sets.
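The recommendation on DG set sizing, loading profile and overload limits can be illustrated with a simple static check of whether the remaining DG sets stay within their overload limit after one or more sets trip. The sketch assumes identical DG sets with equal load sharing; the 110% overload limit and the shed fractions are hypothetical values, and transient behaviour is ignored.

```python
def survives_trip(n_running, load_fraction, tripped=1,
                  overload_limit=1.1, shed_fraction=0.0):
    """Return True if the remaining DG sets stay within the overload limit.

    n_running      -- number of DG sets running before the fault
    load_fraction  -- pre-fault loading of each set as a fraction of MCR
    tripped        -- number of DG sets lost simultaneously
    overload_limit -- hypothetical admissible loading per set (fraction of MCR)
    shed_fraction  -- share of the total load removed by preferential tripping
                      and fast load reduction
    """
    remaining = n_running - tripped
    if remaining <= 0:
        return False  # no running DG set is left, so a blackout follows
    total_load = n_running * load_fraction          # in units of one set's MCR
    post_fault_load = total_load * (1.0 - shed_fraction) / remaining
    return post_fault_load <= overload_limit

# Two sets at 60% MCR: losing one overloads the other unless load is shed.
print(survives_trip(2, 0.6))                      # False
print(survives_trip(2, 0.6, shed_fraction=0.1))   # True
# Four sets at 80% MCR tolerate the loss of one set, three sets do not.
print(survives_trip(4, 0.8))                      # True
print(survives_trip(3, 0.8))                      # False
```

The example also shows why the amount of load that can be tripped matters: the same pre-fault loading is acceptable or not depending on the shed fraction.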

Conclusions
In this study, the CASA method was employed for the safety analysis of a DEP system. Through its application, the blackout failure rate and frequency for the cruise ship power plant were estimated for different operation phases and varying design parameters as well as different operating power demand profiles. Various case studies were investigated, including the addition of new functions and intelligent prewarning capability for the system components. This method provided quantification of the considered blackout event frequency (and probability) as well as criticality metrics, leading to the identification of the factors that contribute most to blackout events. Based on the derived results, relevant safety recommendations for the investigated cruise ship DEP system were proposed.
It was found that the overall blackout frequency for the investigated cruise ship power plant was around 0.4 events per ship-year, whilst the blackout frequency was calculated as 0.003 events per ship-year in the sailing phase and 1.5 per ship-year in the harbour phase.
It was deduced that the DG set loading conditions and the number of DG sets connected to the ship's electric network have a significant influence on the blackout failure rate, and therefore the blackout frequency can be reduced by controlling them.
The reliable operation of the PMS fast electrical load reduction, the prewarning and reconfiguration functions was found to be crucial for avoiding blackout events.
In cases where a number of DG sets operate, failures in the components used for the electrical power generation control, such as the DG sets' fuel racks, the electric power sensors or/and the propulsion motors load reduction functions, become more important. The mechanical component failures, such as lubrication oil or cooling water system failures, become more important in cases where a small number of DG sets operate. Failures leading to the simultaneous loss of multiple DG sets are also important from a blackout perspective, in cases where a smaller DG set number operates.
In summary, this study demonstrated that the employed method allowed the assessment of the impact of different parameters on the failure rate of the overall system's undesired event, overcoming the STPA limitations. It is also expected that the results of this analysis will support the design of safer DEP systems. Future work could focus on the estimation of additional safety metrics for the investigated DEP system, such as blackout duration, partial blackout probability, and blackout risk, as well as on developing intelligent diagnosis techniques for the DEP system.

Funding: Part of this research was funded by the "NEXUS-Towards Game-changer Service Operation Vessels for Offshore Windfarms" project, which received funding from the European Union's Horizon 2020 research and innovation action under grant agreement No. 774519.