2. State of the Art
The concept of dependability in the context of power supply systems began to attract scholarly attention only in the early 1990s. The earliest known reference dates to 1993 [17], where it was emphasized that achieving both high-quality power supply and effective cost control represents a complex engineering challenge. The authors proposed a new approach based on integrating cost analysis with reliability indicators; however, their methods were limited exclusively to medium-voltage distribution equipment.
The monograph [18], published in 1996, provided a comprehensive exposition of reliability theory with applications to power systems. Particular attention was devoted to distributed generation. The authors identified two major drawbacks of centralized power plants: substantial investment costs and the risk of large-scale adverse impacts on society and the environment in the event of a production shutdown. At the same time, the reliability issues of distribution networks were effectively not analyzed.
Study [19] notes that the assessment of dependability indicators in electrical systems is a necessary prerequisite for ensuring their stable and reliable operation under conditions of increasing energy system complexity and decentralized generation. Determining the level of dependability enables the timely identification of critical system components, prediction of failure risks, and informed decision-making regarding reliability enhancement and component redundancy. Such an approach strengthens the resilience of power systems to emergency conditions and external impacts, which is a key factor in the sustainable development of regional and national power sectors. The integration of dependability assessment methods into energy planning strategies ensures effective resource management and optimization of investments in infrastructure modernization.
The problem of sustainable development in “smart” regions under conditions of escalating hybrid threats and sociotechnical instability is examined in [20]. The authors propose a scientifically grounded approach to risk assessment for critical infrastructural and functional systems that underpin digitally transformed regional territories. The methodology is based on fuzzy set theory and logico-linguistic analysis, which enables the incorporation of parameter uncertainty, fragmented expert information, and the absence of a unified “risk picture” in complex infrastructural environments.
Particular attention is devoted to the technogenic, informational, and mobile components of infrastructure that ensure regional viability across the phases of planning, response, and recovery. The obtained results may serve as a conceptual and methodological foundation for the development of a fuzzy risk assessment system that can be integrated into sustainable development strategies for smart regions at the levels of regional governance, cross-sectoral planning, and security policy.
An empirical analysis of the impact of smart city development on national sustainable development indicators is presented in [21]. Drawing on 29 years of statistical data from Turkey, the author demonstrates that the development of smart cities has a statistically significant positive effect on sustainable development goals, particularly in terms of environmental efficiency, pollution reduction, improved governance quality, and enhanced social well-being. At the same time, the author emphasizes the necessity of a comprehensive approach: digital solutions must be accompanied by policies promoting environmental and social inclusivity to ensure long-term effectiveness.
The conclusions of the study are important for shaping national and regional policies, indicating that smart initiatives can serve as an effective instrument for sustainable development when implemented with consideration of the local context and broader socio-economic transformations.
The implementation of the “smart sustainable urban development” concept using the example of planning Indonesia’s new capital is examined in [22]. The authors analyze how the integration of smart technologies, intelligent urbanism, digital infrastructure, and sustainability principles can serve as a foundation for future urban growth, long-term planning, climate resilience, and effective governance. The findings demonstrate that such integration makes it possible to combine the advantages of rapid urban development with adherence to environmental, social, and economic standards, thereby improving quality of life, enabling climate adaptation, and promoting balanced growth. This provides a concrete practical case illustrating how smart + sustainable concepts can become a strategic framework for the development of a city or region, particularly in contexts involving new urban construction or large-scale urbanization plans.
Thus, the reviewed studies consistently show that dependability is a key characteristic of modern electrical energy systems, ensuring their stable operation amid decentralization, digitalization, increasing load, and the demands of sustainable development. Without its quantitative assessment, it is impossible to guarantee the reliability, adaptability, and security of energy systems at both local and regional levels.
The assessment of dependability in power supply systems has traditionally relied on several methodological approaches, which can be grouped into a number of major categories. In the following discussion, a general classification of these methods is first provided, followed by a detailed analysis of each category with consideration of its specific features and potential areas of application.
Deterministic methods for dependability assessment are based on the use of fixed system parameters and predefined operational scenarios. On this basis, indicators of reliability, availability, and continuity are calculated without accounting for stochastic factors [23]. The primary advantage of this approach is its relative simplicity: such models do not require substantial computational resources and can often be implemented on standard personal computers. Moreover, the time required to obtain results is typically minimal [24].
However, the absence of probabilistic components significantly limits the practical value of deterministic methods, as they cannot account for random failures or complex interdependencies between system components. Scaling these models to large or highly complex systems is also challenging, which reduces their suitability for in-depth studies. The limitations of this approach are thoroughly discussed in [25]. In practice, deterministic methods are used mainly for preliminary analysis of simple systems or in cases where the application of more advanced approaches is infeasible [26].
Probabilistic methods differ in that they rely on the analysis of statistical data on component failures and the distributions of their operating times. This group includes well-known techniques such as Failure Modes and Effects Analysis (FMEA) [27,28] and Fault Tree Analysis (FTA) [29], which are widely used to evaluate reliability indicators, including mean time to failure, mean time to repair, and the probability of failure-free operation [30]. The use of real statistical data enhances the accuracy of results, and the explicit consideration of random events makes these methods effective for systems of moderate complexity.
However, their effectiveness strongly depends on the availability of reliable statistical information. In its absence, the accuracy of dependability assessment decreases. Additionally, the capability of these approaches to model complex dynamic processes remains limited. For these reasons, probabilistic methods are applied primarily to systems with a relatively small number of components or for evaluating general reliability characteristics.
Markov analysis occupies a special place among dependability assessment methods, as it enables modelling of system state changes and transitions between states while accounting for both failures and recovery processes [31]. This approach makes it possible to represent complex interdependencies and incorporate rare events, rendering it applicable to systems of virtually any complexity. Its principal advantages include the dynamic nature of the analysis and the high accuracy of the results. At the same time, constructing Markov models requires specialized knowledge of the system under study and detailed information about state parameters and transition probabilities. An additional limitation is the significant computational burden, which grows rapidly as system structure becomes more complex. A detailed examination of the advantages and drawbacks of Markov models is provided in [32], while the computational resources required for their practical implementation are thoroughly described in [33]. Consequently, Markov models are typically employed for analysing complex, multi-component systems in which it is essential to account for a wide range of interdependencies and behaviors under unpredictable failure conditions.
The Monte Carlo method belongs to iterative approaches and is based on performing a large number of simulations of random scenarios in order to obtain statistical estimates of system reliability parameters [34]. In [35], a comprehensive analysis of the advantages and disadvantages of this method is presented, using dynamic biological systems as an example and applying a combination of Monte Carlo and Markov models. Although the domain under consideration is highly specific, the conclusions drawn can be extended to other classes of dynamic systems, including energy systems.
Among the main strengths of this approach is its universality, as the method allows the incorporation of a wide range of factors, including random destructive events. An important characteristic is the absence of strict dependence on the form of the probability distribution of random variables, which makes Monte Carlo simulation applicable to systems with diverse operational characteristics. In addition, this approach enables the modelling of non-standard or unique scenarios that are difficult to describe using other methods.
At the same time, the drawbacks of this method include substantial computational demands arising from the need to perform a large number of iterations, which necessitates the use of high-performance hardware [36]. The long execution time becomes particularly critical for complex systems with many components. Furthermore, as noted in [37], obtaining reliable results in cases involving rare events requires an even greater number of simulations, thereby reducing the efficiency of the approach.
Regarding resource requirements, the Monte Carlo method is characterized by a high demand for computational power, often necessitating the use of servers, clusters, or even supercomputers. The volume of input data is more flexible and depends on the level of detail required by a particular model. However, the time needed to carry out the computations is generally significant. As the authors of [38] note: “The reliability assessment of large power systems, particularly when considering both generation and transmission facilities, is a computationally demanding and complex problem. The sequential Monte Carlo simulation is arguably the most versatile approach for tackling this problem. However, assessing sampled states in the sequential Monte Carlo simulation is time-intensive, rendering its use less appealing.” In practice, the Monte Carlo method is typically used for analysing complex systems for which deterministic or classical analytical methods are infeasible or ineffective, and where the advantages of Monte Carlo simulation outweigh those of alternative approaches. It is also applied to evaluating the behavior of systems characterized by a high degree of uncertainty.
As an intermediate conclusion, it should be noted that among the existing approaches to dependability analysis of power supply systems, deterministic methods are characterized by minimal requirements for computational resources, processing time, and the volume of input data. At the same time, their applicability under conditions of disruptive influences is limited. This is due to the fact that adequate representation of system operating modes requires the use of complex mathematical models expressed in terms of currents, voltages, or transmitted power, which complicates the analysis of atypical scenarios. As a result, the accuracy of the outcomes remains low, and such methods cannot be considered suitable for investigating complex situations.
Probabilistic methods offer a certain compromise between resource demands, accuracy, and the amount of required data. Their suitability for analysis stems from the probabilistic nature of disruptive influences, which allows the modelling of random processes. However, a significant limitation is the inability to adequately capture the temporal dynamics of the consequences of these influences. Consequently, the accuracy of the results—particularly for complex power supply systems—remains insufficient.
The most accurate approach for analysing dynamic systems with a high level of detail is generally considered to be Markov analysis. It makes it possible to account for temporal dependencies and transitional states, which is particularly valuable when studying the evolution of disruptive influences. However, the application of this method requires prior estimation of transition probabilities between system states. In the case of complex energy systems, the number of possible states may be extremely large, and the transition probabilities often remain uncertain. This necessitates the use of expert-based assessment methods as well as the development of techniques for reducing the state space in order to ensure the practical feasibility of the analysis.
The Monte Carlo method is distinguished by its exceptional flexibility and its ability to reproduce virtually any scenario. Its accuracy is high; however, the substantial time requirements and the need for significant computational resources make its application to reliability assessment of complex power supply systems less practical.
It is important to emphasize that none of the methods reviewed is designed specifically for dependability assessment. However, since dependability is an integrated criterion in which reliability plays a central role, reliability evaluation methods can be used as a foundation for dependability analysis. Dependability indicators—such as mean time to failure or failure probability—can be derived on the basis of results obtained from these individual methods.
Thus, in assessing the dependability of electrotechnical systems, the most appropriate approach is the application of Markov analysis. Its practical implementation requires performing a procedure for reducing the number of system states considered during the modelling of disruptive influences, in order to maintain an optimal balance between result accuracy and acceptable computational effort.
One of the main challenges in applying Markov processes to the modelling of complex technical and software systems is the state-space explosion phenomenon. This occurs when the number of system states grows exponentially with the increase in the number of components, subsystems, or parameters [39,40,41]. Such situations are typical for multicomponent fault-tolerant structures, detailed repair and maintenance models, as well as for problems involving dependent events or correlated failures. As a result, analytical methods become impractical due to the enormous size of intensity matrices, computational costs exceed available resources, and the interpretation of obtained results becomes significantly more difficult [38,39]. An additional complication arises from the limitations of Monte Carlo methods, which require an excessive number of simulation trajectories to ensure statistical convergence in large state spaces [42]. Thus, the state-space explosion severely restricts the practical applicability of Markov analysis and necessitates the use of specialized model-reduction techniques.
To address this issue, a number of dimension-reduction methods have been proposed. The most well-known among them are Method of Aggregation and Decomposition (MAD), Modular Performance Modeling (MPM), Stochastic Reward Nets (SRN), and Hierarchical Hidden Markov Models (HHMM).
In the following section, we provide a detailed analysis of these dimension-reduction techniques in the context of Markov-based modelling.
2.1. Method of Aggregation and Decomposition
The Method of Aggregation and Decomposition (MAD) emerged as a classical tool in Markov analysis for overcoming the state-space explosion problem. Its essence lies in two complementary operations: state aggregation and system decomposition. Aggregation involves combining groups of similar states into macrostates in such a way that the key probabilistic characteristics of the original model are preserved. Decomposition, in turn, consists in partitioning a large system into relatively independent subsystems that can be analyzed separately, with their results subsequently integrated into a simplified global description [43,44].
A major achievement of this approach was the formal demonstration that systems with a nearly completely decomposable structure (NCD models) are well suited for MAD-based analysis, particularly in cases where internal transitions within subsystems dominate over inter-subsystem transitions [45]. This makes it possible to replace a complex system with a set of local models while maintaining controllable approximation errors. The key advantage of MAD is its ability to handle extremely large models—a property that is especially valuable in telecommunications networks, IT infrastructures, and complex energy systems [46].
However, the method also has significant limitations: approximation errors arising from subsystem aggregation may accumulate in the global model, making formal error control essential. Contemporary studies therefore focus on developing algorithms for adaptive refinement of aggregated states and on establishing formal error bounds for MAD-based models [47,48].
2.2. Modular Performance Modeling
The Method of Modular Performance Modeling (MPM) was developed from the idea that complex systems can be most effectively analyzed when viewed as a collection of functional modules. Each module is represented by its own Markov model describing its internal behavior, while the integration of these individual models forms a simplified representation of the system as a whole [49]. This approach avoids the need to directly construct an enormous state space and instead replaces it with the analysis of separate, smaller-scale modules.
A key advantage of MPM is its high flexibility: models of individual subsystems can be reused, combined, and extended as needed. For this reason, the method has been widely applied in computer networks, distributed computing systems, and even transport modelling [50].
However, MPM also has important limitations. Accurately representing complex dependencies between modules is difficult: such interconnections typically must either be simplified or incorporated through additional parameters, which reduces model precision. Contemporary research addresses these challenges by integrating MPM with deep learning techniques [51] and computational intelligence methods, enabling more accurate modelling of inter-module dependencies and improving overall performance [52].
2.3. Stochastic Reward Nets
Stochastic Reward Nets (SRNs) constitute an extension of classical Petri nets that combines graphical expressiveness with rigorous Markovian semantics. Unlike standard Petri nets, SRNs introduce the concept of rewards, which are associated with states or transitions and interpreted as performance indicators, availability metrics, or resource consumption measures [53]. This enables the simultaneous modelling of system behavior and its efficiency.
State-space reduction in SRNs is achieved through the exploitation of structural properties such as invariants, symmetries, and combinatorial constraints, which makes it possible to avoid constructing the full underlying Markov model [54]. The principal advantage of SRNs lies in their universality: the approach is employed both for analysing the performance of computer networks and for evaluating the reliability of complex engineering systems [55]. A limitation, however, is the increasing complexity when modelling very large systems, as the state space may still become excessively large despite the applied reductions.
2.4. Hierarchical Hidden Markov Models
Hierarchical Hidden Markov Models (HHMMs) are a generalization of classical hidden Markov models in which each higher-level hidden state is itself a separate lower-level Markov model [56]. This structure enables the representation of multi-level processes in systems that possess a natural hierarchical organization. Owing to this, HHMMs achieve effective state-space reduction, since the full model does not need to be constructed explicitly: analysis is carried out level by level rather than across the entire state space at once [57]. The key advantage of HHMMs is their ability to capture complex hierarchical dependencies, which makes them suitable both for technical systems and for machine-learning tasks, including speech recognition and the analysis of biological sequences.
HHMMs can also be used to reduce the state space in electrical systems, although their application in this domain is still less common than in bioinformatics, ecology, or behavioral modelling. The central idea is that complex electrical systems also possess a multi-level structure: for instance, higher-level states may represent global operating modes of the power system (normal operation, overload, emergency state), whereas lower-level states correspond to local subsystems (generators, substations, individual components).
In a flat HMM, all possible combinations of such states would need to be encoded explicitly, leading to combinatorial explosion. HHMMs, by contrast, allow the construction of a tree-like hierarchy in which only the submodels corresponding to the current global regime are activated. This significantly reduces the effective state space and lowers the computational burden of Markov-based analysis. However, HHMMs also have important limitations: the number of parameters that must be estimated can be extremely large, complicating the training process and making it resource-intensive [58].
Thus, the MAD, MPM, SRN, and HHMM approaches represent conceptually different strategies for addressing the state-space explosion problem. They demonstrate that state-space reduction may be achieved through structural simplification, modularity, the use of formal graph-based models, or hierarchical abstraction. Each method has its strengths and limitations, and current research actively explores the integration of these techniques and the development of hybrid methodologies.
The analysis of existing dependability-assessment methods indicates that none of the available approaches—deterministic, probabilistic, Markov-based, or Monte Carlo—offers an optimal balance between accuracy, scalability, and computational efficiency. Existing state-space reduction methods, including MAD, MPM, SRN, and HHMM, alleviate the exponential growth problem only partially and remain constrained when applied to complex electrotechnical systems. This underscores the need for a new approach that would combine structural decomposition and functional aggregation within a unified analytical framework, while preserving the properties of a Markov process to enhance the accuracy of dependability evaluation.
3. Methodology
3.1. Approach
The Triplet-based Integrated Clustering and Aggregation Method (TRICAM) is designed for performing Markov analysis of complex electrotechnical systems containing a large number of components, each capable of operating in several distinct states. Through multilevel clustering based on the formation of triplets (groups of three elements), the method enables a substantial reduction in the state-space dimensionality. This, in turn, makes it possible to obtain approximate dependability estimates even for systems comprising hundreds of components.
Within this approach, each individual component of an electrical system (a single device or a group of devices), as well as the system as a whole, may exist in one of three generalized states: (1) a state of full operability (operational level—100%), (2) partial or limited operability (operational level—50%), (3) complete inoperability (operational level—0%). The core idea of the method lies in the sequential structural decomposition of an initial electrotechnical system with a large number of elements (individual devices) into triplets, each formed from three system components. This means that already at the first stage of decomposition, the number of elements is reduced by a factor of three. Combined with the generalization of states into the three operability levels listed above, this results in a significant reduction in the state-space dimensionality.
To illustrate the reduction in state space after the first step of the method, consider a system containing N_0 components, each of which can exist only in one of the three states described above. Accordingly, the dimensionality of the full state space of the initial system before applying the method (step zero of structural decomposition) is equal to:

$S_0 = 3^{N_0}$. (1)

After each step of structural decomposition, the number of elements in the system is reduced by a factor of three. Therefore, the number of elements after the k-th decomposition step is given by:

$N_k = N_0 / 3^k$, (2)

and the number of states in the system at that stage can be computed as:

$S_k = 3^{N_k}$. (3)

At the k-th decomposition step, the state-space reduction factor can be calculated as follows:

$K_k = S_0 / S_k = 3^{N_0 - N_k}$. (4)
To illustrate the state-space reduction achieved by the proposed method, let us consider a simple electrotechnical system that contains N_0 = 27 elements. The reduction in the state space depending on the step of the method, according to Formulas (1)–(4), is shown in Table 1.
The large numerical values presented in Table 1 correspond to the dimensionality of the system state space and arise from the exponential dependence on the number of elements. These values are used solely to illustrate the scale of state-space growth in Markov models and to demonstrate the reduction achieved by the proposed method.
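The reduction schedule of Formulas (1)–(4) is easy to reproduce; the following minimal Python sketch prints the values behind Table 1, assuming N_0 = 27 as in the example above (the layout of Table 1 itself is not reproduced).

```python
N0 = 27  # initial number of elements, as in the example above

for k in range(4):           # decomposition steps 0..3
    Nk = N0 // 3**k          # elements remaining after step k, Formula (2)
    Sk = 3**Nk               # state-space dimensionality at step k, Formula (3)
    Kk = 3**(N0 - Nk)        # reduction factor S0/Sk, Formula (4)
    print(f"step {k}: N_k = {Nk:2d}, S_k = {Sk}, K_k = {Kk}")
```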
Simultaneously with the structural decomposition of the system, the TRICAM method involves a step-by-step aggregation of the state space, reducing it to only three states—100% operability, 50% partial operability, and complete inoperability. A key stage at each level of such aggregation is the construction of a transition matrix for the aggregated triplet states, which enables Markov analysis at the corresponding level of abstraction.
At the first decomposition step, this matrix is determined directly on the basis of information about the behavior of the elementary system components (that is, the individual devices forming the triplets), their individual reliability characteristics, and their interdependencies. However, at subsequent steps (the second, third, and so on), it becomes necessary to recalculate the transition matrices for the new aggregated entities formed by combining the triplets of the previous level.
This process is non-trivial, because the aggregated states of the new triplets do not have a direct correspondence to the original states of the elementary components. Therefore, a specialized state aggregation method is required—one that enables the formal derivation of transition probabilities between aggregated states based on Markov analysis, taking into account the combinations of internal states from the previous level. This method, termed the “three-state cluster aggregation method,” is not a standalone technique; rather, it is used within TRICAM after each step of structural decomposition.
Implementing the three-state cluster aggregation method requires adherence to probabilistic aggregation rules and preservation of Markov properties when transitioning to a higher level. Let us now examine this method in greater detail.
3.2. Three-State Cluster Aggregation Method
The three-state cluster aggregation method, built upon a formal Markov-based framework, serves as the fundamental tool applied at every level of the TRICAM hierarchical structure. Its primary objective is to perform a complete analysis of a single triplet composed of three elements, each of which is described by its own transition matrix.
The input data for the method consist of three square 3 × 3 matrices corresponding to the components of the triplet: U1, U2, and U3. In addition, an aggregation rule is specified to determine the integrated state of the triplet. The rows and columns of these matrices represent the three possible states:
1—full operability (100%),
2—partial operability (nominally 50%),
3—complete failure (0%).
If an electric power supply system is considered as an example, its primary function is to provide consumers with electricity in the required volume and with acceptable quality parameters. The use of three-state aggregation (100%, 50%, 0%) reflects typical engineering operating regimes of the system—full operability, partial power supply, and complete failure—and represents a deliberate compromise that allows for a substantial reduction in the dimensionality of the Markov model while preserving its analytical tractability. The intermediate state (50%) is interpreted as an approximate level of partial fulfilment of the system’s primary function, corresponding to a reduced share of supplied load or number of served consumers, rather than a strictly fixed value or a probability; at the same time, the method in general allows for alternative aggregation rules with a different number of states depending on the required level of detail.
The matrix element located at the intersection of row i and column j specifies the probability that a component transitions from state i to state j. Thus, the main diagonal contains the probabilities of remaining in the same state under the influence of disruptive factors. The elements below the diagonal represent the probabilities of partial or full restoration of operability after a certain period of time.
Since each of the three elements may exist in one of three states, the full state space of the triplet consists of 3 × 3 × 3 = 27 states. To perform state aggregation, a rule must be formulated that maps each of the 27 states to one of the three aggregated states “1”, “2”, or “3”. This rule is defined by conditions that assign each of the 27 elementary states to one of the generalized states. The specific choice of rules depends on the nature of the devices forming the triplet and the characteristics of their interconnection. In this work, for a series configuration, the following intuitive rule is applied:
- The triplet is considered fully operational (100%) if all its elements are operational;
- The triplet is considered partially operational (50%) if at least one element is in this state;
- The triplet is considered completely non-operational (0%) if at least one element has fully failed (for a series configuration, this failure condition takes precedence when both degraded and failed elements are present).
The method is implemented through the following sequence of steps:
1. Determining the transition probability matrices for each of the three elements (it is advisable to use the Monte Carlo method in combination with expert evaluation).
2. Formulating the rules for aggregating the states of the triplet.
3. Constructing the matrix of elementary states: the rows correspond to the state indices (1 through 27), and each column lists the state values (1, 2, or 3) for each element of the triplet.
4. Computing the transition probabilities between the elementary states of the triplet, resulting in a 27 × 27 matrix (729 elements). The probability of transition between the elementary states State_i(s1, s2, s3) and State_j(t1, t2, t3) is computed as follows:

$P(\text{State}_i \rightarrow \text{State}_j) = \prod_{k=1}^{3} u^{(k)}_{s_k t_k}$, (5)

where $u^{(k)}_{s_k t_k}$ is the probability of transition from state $s_k$ to state $t_k$ for the k-th element, as specified in the corresponding matrix $U_k$.
5. For each pair of aggregated states A→B, use the aggregation rule to construct the aggregated 3 × 3 matrix. For this:
- identify all indices i corresponding to the aggregated state A;
- identify all indices j corresponding to the aggregated state B;
- compute the transition probabilities in the aggregated matrix:

$\tilde{p}_{AB} = \sum_{i \in A} \sum_{j \in B} P_{ij}$. (6)

6. Perform normalization of the resulting aggregated matrix:

$p_{AB} = \tilde{p}_{AB} \big/ \textstyle\sum_{C} \tilde{p}_{AC}$. (7)

Normalization is required to ensure that the sum of probabilities in each row of the aggregated matrix equals one.
The output of the method is an aggregated transition-probability matrix that can be used for performing static or dynamic Markov analysis to compute dependability indicators of electrical systems.
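To make steps 1–6 concrete, the sketch below implements the aggregation for a single triplet in Python. The matrix values are illustrative placeholders rather than case-study data, the three elements are assumed to change state independently (which is what the product in Formula (5) expresses), and complete failure is given precedence in the series-configuration aggregation rule.

```python
import itertools
import numpy as np

# Illustrative (not case-study) 3 x 3 transition matrices for the three
# elements; rows/columns: 1 = 100%, 2 = 50%, 3 = 0% operability.
U = [np.array([[0.7, 0.2, 0.1],
               [0.5, 0.4, 0.1],
               [0.1, 0.1, 0.8]])] * 3         # homogeneous triplet for brevity

def aggregate(states):
    """Series-configuration rule; complete failure takes precedence."""
    if 3 in states:
        return 3      # at least one element has fully failed -> 0%
    if 2 in states:
        return 2      # at least one element is degraded      -> 50%
    return 1          # all elements fully operational        -> 100%

# Step 3: enumerate the 27 elementary states of the triplet.
triples = list(itertools.product((1, 2, 3), repeat=3))

# Step 4: 27 x 27 elementary matrix via Formula (5) (independent transitions).
P = np.array([[np.prod([U[k][s[k] - 1, t[k] - 1] for k in range(3)])
               for t in triples] for s in triples])
assert np.allclose(P.sum(axis=1), 1.0)        # rows remain stochastic

# Steps 5-6: collect probability mass between aggregated classes (Formula (6)),
# then normalize each row (Formula (7)).
agg = np.zeros((3, 3))
for i, s in enumerate(triples):
    for j, t in enumerate(triples):
        agg[aggregate(s) - 1, aggregate(t) - 1] += P[i, j]
agg /= agg.sum(axis=1, keepdims=True)
print(np.round(agg, 4))
```

Summing the probability mass and renormalizing each row is equivalent to averaging the elementary transition probabilities uniformly over the source class; other weightings (for example, by the current state probabilities) are possible if more is known about the within-class distribution.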
The proposed TRICAM method differs from existing aggregation and decomposition approaches in that it is based on a three-state representation of system elements and on iterative structural aggregation with the possibility of adapting aggregation rules at each step of the analysis. Unlike conventional methods, where the aggregation scheme is typically fixed or selected heuristically once, TRICAM allows the aggregation rules to be modified depending on the structure and condition of the system at the current level. This provides methodological flexibility and enables an efficient reduction in the Markov model dimensionality while preserving the interpretability of system states and their connection to real operating regimes. As a result, TRICAM represents a practical and transparent tool for analysing the dependability of complex electric power systems.
One of the key advantages of the three-state cluster aggregation method is its scalability. The same computational algorithm can be applied regardless of the system’s decomposition level, which enables the construction of multi-level hierarchies when using the TRICAM method. Each triplet—whether composed of basic elements or previously aggregated subsystems—is treated within the aggregation method as three separate units with known probabilistic characteristics. This preserves the uniformity of the mathematical model across all stages of analysis.
The method maintains the Markov properties throughout all stages of dependability computation. Owing to the precise calculation of transition probabilities between elementary states and the subsequent clustering at the macro-level, the method ensures correct time-evolution behavior of the system without losing information about its stochastic nature.
The method is also highly flexible, as it can be adapted to both homogeneous and heterogeneous components, including systems in which components have different transition matrices. This makes it suitable for application in complex electrical systems.
Another advantage is the compact representation of data. Aggregated transition matrices have a fixed size of 3 × 3 regardless of triplet complexity, which significantly reduces computational costs during subsequent analysis. This is particularly important when moving to higher-level structures in TRICAM, where dimensionality reduction is a critical factor for preserving the feasibility of full Markov analysis.
In addition, the method offers a high degree of transparency, enabling interpretation of results in terms of the original component states. This opens the way for verification of the developed three-state cluster aggregation method for assessing the dependability of electrotechnical systems.
4. Case Study
Let us verify the three-state cluster aggregation method described in the previous section using the example of the solar power plant installed at the logistics hub “Kyiv Innovation Terminal of Nova Poshta”.
In 2024, a rooftop solar power plant (SPP) with a capacity of 1 MW was commissioned on the roof of Nova Poshta’s Kyiv Innovation Terminal. The generating modules are evenly distributed over an area of 5000 m², which enables partial coverage of the internal electricity consumption of the logistics complex [59]. This SPP is one of the largest commercial rooftop solar installations in Ukraine.
Detailed technical parameters—such as the number and type of solar panels, inverter models, or the configuration of the energy storage system (ESS)—are not disclosed in open sources. However, the general structural layout of a solar power plant of this capacity is well known, which makes it possible to estimate the missing technical characteristics on the basis of this typical configuration. Taking into account construction and technical standards, as well as the parameters of electrical equipment, we can determine the constituent elements of this SPP and their quantities. This, in turn, allows the SPP scheme to be decomposed into triplets and enables the modelling of dependability indicators using the developed three-state cluster aggregation method.
Since the aim of the study is a comparative evaluation of the accuracy of the TRICAM method and traditional Markov analysis, the technical parameters of the photovoltaic power plant are specified in the same manner in both methods. Consequently, the procedure used to determine these parameters does not have a significant influence on the results of the comparative accuracy assessment.
Let us estimate the number of solar modules for a rooftop SPP of this capacity. It is known that the installed capacity of the solar power plant on the roof of Nova Poshta’s Kyiv Innovation Terminal is 1 MW, and the area covered by solar modules is approximately 5000 m². To estimate the number of solar panels that can be installed on this area, one should take into account the average power rating of a single solar panel as well as its dimensions.
Modern industrial-grade monocrystalline silicon solar panels typically have a power rating of 400–550 W and an area of approximately 1.9–2.2 m² [60,61]. Under rooftop installation conditions—where spacing between rows is required to avoid shading—the effective area coverage factor is usually 0.75–0.85 [62].
Let us assume an average module power of 500 W and an area of 2.0 m². In this case, the theoretical number of installed solar panels N can be estimated as follows:

$N = \frac{S \cdot \eta}{S_1} = \frac{5000 \cdot 0.8}{2.0} = 2000$, (8)

where S = 5000 m² is the total roof area, η = 0.8 is the coverage factor, and S_1 = 2.0 m² is the effective area of a single panel taking into account the tilt angle.
We obtain that the total number of panels is 2000, and the corresponding installed capacity of this number of solar panels is 1 MW, which is consistent with open-source information on the parameters of the SPP.
A simplified structural diagram of an industrial solar power plant has the following configuration (Figure 1).
This diagram shows the relationships between the main elements of the SPP: the solar panel (or a block of panels), the energy storage system (ESS), and the inverter. Based on the parameters of these three devices, one can determine how many ESS units and inverters such an SPP may contain, as well as how many solar panels can be combined into a single block.
Under conditions of elevated risks of destructive impacts (in particular, due to attacks on energy infrastructure), the role of the ESS is not limited to smoothing generation fluctuations; it also ensures the dependability of the entire SPP. This is achieved by charging from the solar panels during hours of excess generation (daytime) and discharging during periods of deficit (night or cloudy conditions), as well as supporting peak loads in the evening or early morning.
In modern industrial SPPs, it is typical to use three-phase string inverters with a capacity of 100 kW, for example the Huawei SUN2000-100KTL-M1 inverter with a nominal power of P_Inv = 100 kW. The efficiency of such an inverter, η_CEC, is slightly below unity [63]. Given the SPP’s total power P = 1000 kW, the total number of inverters required to support this capacity should be (rounded up):

$N_{Inv} = \left\lceil \frac{P}{\eta_{CEC} \cdot P_{Inv}} \right\rceil = 11$. (9)
As an ESS device, one may employ a Tesla Megapack 2 with a capacity of 500 kW·h and an efficiency η_ESS = 0.9 [64]. In this case, 11 ESS units of this type provide a total capacity of 4988 kW·h, which can ensure the operation of the SPP for almost 5 h in the absence of external power supply. If the SPP comprises 2000 solar panels, they can be grouped into blocks. With a uniform distribution of panels across the blocks, each block will contain 182 panels.
According to the simplified structural diagram in Figure 1, it can be seen that the panel (or block of panels), the ESS device, and the inverter form a triplet. In total, the SPP consists of 11 triplets, and each triplet includes: 182 solar panels, 1 ESS unit, and 1 inverter. This already makes it possible to apply the triplet model described in Section 3 (Methodology) to assess dependability indicators and verify these estimates. However, to do so, it is necessary to construct the transition-probability matrix between the states of the triplet. We will construct this matrix on the basis of calculations, or more precisely, on an approximate estimation of the probability that each element of the triplet is in a state of full operability (100%), partial operability (50%), or complete failure after a destructive impact—namely, the strike of a “Shahed-136” drone on the building on which the SPP is installed.
4.1. Estimation of Transition Probabilities for a Solar Panel Block
To estimate the transition probabilities between the three states for a random block of solar panels in the event of a “Shahed-136” drone strike at a random point on the roof, it is appropriate to use the Monte Carlo method. In each random trial, it is necessary to know the distance from the explosion epicenter at which the damage will be high, moderate, or minor. These cases correspond to the block states “inoperable (0%)”, “partially operable (50%)”, and “operable (100%)”, respectively.
This information is fully provided in FEMA 426 (“Reference Manual to Mitigate Potential Terrorist Attacks Against Buildings”), developed by the U.S. Federal Emergency Management Agency (FEMA) [63]. FEMA 426 is devoted specifically to the protection of buildings against explosions. This document presents empirical formulas for calculating overpressure resulting from an explosion as a function of the explosive mass (TNT) and the distance from the explosion epicenter. In Figure 2, a graph is plotted showing the dependence of overpressure on the distance from the epicenter of an explosion equivalent to 50 kg of TNT carried by the Shahed-136.
From this graph, the corresponding distances from the explosion epicenter can be determined. Let us assume that a distance of 8.8 m or less from the epicenter defines the zone in which the block completely loses its operability (0% operability). In the zone from 8.8 m to 25 m, the panel block is considered to have partial operability (50%), since, according to FEMA data, at an overpressure of 0.2 psi glass surfaces may be shattered. A distance greater than 25 m from the epicenter is regarded as safe, and panels located at such distances are assumed to remain fully operational. In Figure 2, these zones are shown in red, yellow, and green, respectively.
Based on the geometric data from [59], the frequency (probability) of a random panel block being in each of the states (full operability 100%, partial operability 50%, complete inoperability 0%) under a random strike on the roof of the building by a single “Shahed-136” attack drone with a 50 kg TNT-equivalent warhead has been calculated. The calculation results are presented in Figure 3.
Given that 1,000,000 trials were carried out using the Monte Carlo method, these frequencies can be regarded as the probabilities that a random solar panel block transitions from the fully operational state to each of the three described states.
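A sketch of such a trial-based estimate is given below. The rectangular roof footprint (100 m × 50 m, matching the reported 5000 m²) and the uniform placement of both the strike point and the block centre are assumptions made for illustration, since the exact geometry from [59] is not reproduced here; the same scheme, with the 7.9 m and 6.9 m radii and the 35 m × 7 m installation site, carries over to the ESS and inverter estimates in Section 4.2.

```python
import numpy as np

rng = np.random.default_rng(42)

# Assumed rectangular roof footprint: 100 m x 50 m = 5000 m^2.
W, H = 100.0, 50.0
R_FAIL, R_PARTIAL = 8.8, 25.0     # damage radii from Figure 2, metres

n = 1_000_000
strike = rng.uniform([0.0, 0.0], [W, H], size=(n, 2))  # drone impact points
block = rng.uniform([0.0, 0.0], [W, H], size=(n, 2))   # centre of a random block
d = np.linalg.norm(strike - block, axis=1)

p_fail = np.mean(d <= R_FAIL)                          # state 3: 0% operability
p_half = np.mean((d > R_FAIL) & (d <= R_PARTIAL))      # state 2: 50%
p_ok = np.mean(d > R_PARTIAL)                          # state 1: 100%
print(f"100%: {p_ok:.3f}, 50%: {p_half:.3f}, 0%: {p_fail:.3f}")
```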
To construct the complete transition-probability matrix, it is necessary to take into account additional conditions, which, for example, may be as follows:
- When the block is in the state of partial operability (state 2), the system returns to full operation (state 1) with probability 0.5;
- In the case of complete failure (state 3), the system is restored to state 1 with probability 0.1.
In the absence of further constraints, we assume a logical distribution of the remaining probabilities between staying in the current state and transitioning to a degraded state. In particular, for transitions from state 2: transition to state 1: 0.5 (given condition), remain in state 2: 0.4 (probability of partial preservation of functionality), transition to state 3: 0.1 (deterioration). For transitions from state 3: transition to state 1: 0.1 (given condition), transition to state 2: 0.1 (partial recovery), remain in state 3: 0.8 (probability of staying in the failed state).
The transition probabilities of the basic components were determined using structured expert judgement. The expert group consisted of five specialists in the design and operation of photovoltaic power plants, as well as a specialist in the field of critical infrastructure security.
Thus, the complete transition-probability matrix for the first element of the triplet—the solar panel block—has the form:

$U_1 = \begin{pmatrix} p_{11} & p_{12} & p_{13} \\ 0.5 & 0.4 & 0.1 \\ 0.1 & 0.1 & 0.8 \end{pmatrix}$,

where the first row (p_11, p_12, p_13) consists of the Monte Carlo frequencies presented in Figure 3.
In the constructed transition probability matrix, destructive events and recovery processes differ in their physical nature. The first row of the matrix corresponds to transitions between system states caused by an external destructive impact (a single drone strike), whereas the subsequent rows characterise the probabilities of transitions associated with repair and recovery processes. Such a combination is correct, since the transition matrix describes not the events themselves, but the changes in the functional states of the system depending on its current state. The different nature of destructive and recovery influences is reflected through the structure and numerical values of the matrix elements, which does not violate its stochastic properties and is consistent with the generally accepted approach to representing failure–recovery processes in matrix form.
The transition-probability matrix for the first element of the triplet has been derived. We now need to estimate the corresponding probabilities for the electrotechnical equipment in the triplet—namely, the inverter and the ESS device.
4.2. Estimation of Transition Probabilities for ESS Devices and Inverters
Let us now model the transition-probability matrices between the states of the SPP’s electrotechnical equipment—namely, the inverters and ESS devices. According to [62], the Tesla Megapack 2 ESS is housed in a 20-foot ISO-type container installed on a concrete foundation, without windows or openings. The container has a metal shell (steel panels with a thickness of 4 mm), with battery blocks and power electronics located inside. It features a hermetic design and can be installed on open sites.
According to the typical overpressure levels given in FEMA 426, at an overpressure of 8 psi, partial destruction of thin-walled metal structures occurs, while at levels above 12 psi, complete destruction takes place. Thus, it may be assumed that the second component of the triplet—the Tesla Megapack 2 ESS—remains fully operational at overpressures below 8 psi, transitions to a state of partial operability in the range from 8 to 12 psi, and completely loses operability at pressures of 12 psi and above. According to the graph in Figure 2, an overpressure of 8 psi for an explosion equivalent to 50 kg of TNT corresponds to a distance of 7.9 m from the explosion epicenter, while at a distance of 6.9 m the pressure reaches the critical value of 12 psi.
The layout of the Tesla Megapack 2 ESS blocks at the installation site is shown in Figure 4. The same figure also presents the damage radius resulting from the detonation of the warhead of a “Shahed-136” drone. The example of the layout scheme for the Tesla Megapack 2 ESS devices is taken from the official website of Tesla Inc. (Austin, TX, USA) [64].
Taking into account the dimensions of the ESS devices (7 × 1.6 m) and the inverters (1.1 × 0.7 m), as well as their layout, it is possible to estimate the size of the installation site. It is approximately 35 m in length and 7 m in width. These data, together with the device dimensions, make it possible to calculate the coordinates of the centers of each specific type of equipment (ESS units and inverters separately), which are required to model the frequency of states of any randomly selected device under a random strike of a drone on the installation site.
In Figure 4, the zones of complete and partial damage are shown in red and yellow, respectively. The zone in which the probability of damage is close to zero is indicated in green.
We now calculate the state frequencies for the ESS devices using the Monte Carlo method. According to the results, the frequency of ESS units remaining undamaged is 0.572. The frequency of being in a state of partial operability is 0.233, while with a frequency of 0.195 the ESS devices completely lose their operability. The comparatively low frequency of ESS being in a partially operable state is explained by the very narrow strip of the yellow zone, as shown in Figure 4. These values form the first row of the transition-probability matrix for the ESS devices. The remaining rows are obtained by assessing the maintainability of the Tesla Megapack 2 ESS.
The Tesla Megapack 2 ESS is a high-tech device containing a large number of components of foreign manufacture [65,66]. Most of its volume is occupied by battery modules, which, even when slightly damaged, are not subject to repair. Therefore, we assume that in the case of partial loss of operability, the ESS remains in the same state with probability 95%. In the case of complete loss of operability, only replacement of the unit is possible, with a probability of 1%. This makes it possible to derive the full transition-probability matrix U_2 for the second element of the triplet—the Tesla Megapack 2 ESS devices: its first row contains the Monte Carlo frequencies obtained above (0.572, 0.233, 0.195), while its recovery rows encode the stated maintainability assumptions.
An analogous calculation for the Huawei SUN2000-100KTL-M1 inverters [67] results in the transition-probability matrix U_3 for the third element of the triplet.
Here, it is taken into account that, unlike the Tesla Megapack 2 ESS, the Huawei SUN2000-100KTL-M1 inverters have a higher degree of maintainability. Most of their electronic components are readily accessible and have many available analogues. Therefore, the probability of fully restoring an inverter from the 50% operability state is taken as 60%, while recovery from complete inoperability to full operability is assumed to be 20%. In turn, the probability of restoring the 50% operability state from complete failure is taken as 40%.
4.3. Verification of the Method Using the Example of a Solar Power Plant
One of the key tasks in implementing the new TRICAM method is establishing its validity and ensuring consistency with the results of the classical approach—Markov analysis. For this purpose, verification must be performed to confirm the correctness of the aggregated transition probabilities.
The TRICAM method aggregates the states of a triplet from 27 elementary states into 3 aggregated ones, which significantly reduces the dimensionality of the state space and the computational complexity of the problem. However, such aggregation creates challenges for direct comparison with classical Markov analysis, as there is a mismatch in the number of states: the classical model operates with the full set of 27 elementary states, whereas the aggregated TRICAM model includes only three generalized states—full operability (100%), partial operability (50%), and complete failure (0%).
To ensure an objective assessment of TRICAM’s accuracy, it is necessary to select a reference state that is common to both models—the classical and the aggregated one. The only such state is the state of full operability of all three components of the triplet, i.e., the situation in which all elements are functioning at 100%. This is the only state that uniquely corresponds to the aggregated state “full operability” in the TRICAM model and simultaneously has a precise counterpart in the classical model as one of the 27 elementary states.
Thus, the following verification procedure is employed in this study:
- Compute the probability of the aggregated state “Full operability” using the TRICAM method after one or several steps of the Markov process.
- In parallel, compute the probability of the elementary state “Full operability” for all components of the triplet using the classical Markov analysis without aggregation.
- Compare the obtained values, which correspond to the same actual state of the system.
If a high degree of consistency between the results is observed, the TRICAM method can be considered to adequately approximate the system’s behavior with respect to the evaluation of the critically important state of full operability. Such a local verification step is a justified approach for confirming the accuracy of the method in situations where global comparability of states is impossible due to the very nature of aggregation. This approach not only makes it possible to assess the correctness of the constructed aggregated model but also offers a universal verification criterion suitable for any methods that perform dimensionality reduction in the state space.
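A compact numerical sketch of this verification loop is shown below; it repeats the aggregation construction of Section 3.2 so as to be self-contained and again uses illustrative matrices rather than the case-study values.

```python
import itertools
import numpy as np

# Illustrative per-element matrices (rows/cols: 100%, 50%, 0% operability).
U = [np.array([[0.7, 0.2, 0.1],
               [0.5, 0.4, 0.1],
               [0.1, 0.1, 0.8]])] * 3

triples = list(itertools.product((1, 2, 3), repeat=3))
cls = [3 if 3 in s else 2 if 2 in s else 1 for s in triples]  # aggregation rule

# Full 27 x 27 chain and its 3 x 3 TRICAM aggregation (as in Section 3.2).
P = np.array([[np.prod([U[k][s[k] - 1, t[k] - 1] for k in range(3)])
               for t in triples] for s in triples])
A = np.zeros((3, 3))
for i in range(27):
    for j in range(27):
        A[cls[i] - 1, cls[j] - 1] += P[i, j]
A /= A.sum(axis=1, keepdims=True)

# Both chains start from the common reference state "all elements at 100%".
p_full = np.zeros(27)
p_full[triples.index((1, 1, 1))] = 1.0
p_agg = np.array([1.0, 0.0, 0.0])

for step in range(1, 6):
    p_full = p_full @ P
    p_agg = p_agg @ A
    exact = p_full[triples.index((1, 1, 1))]  # elementary "full operability"
    print(f"step {step}: exact = {exact:.4f}, TRICAM = {p_agg[0]:.4f}")
```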
5. Results of Verification
Calculations for the state in which all elements of the triplet are fully operational are presented in Table 2. The table shows the first five steps of the TRICAM method and the traditional Markov analysis.
The results of numerical modelling indicate that already after the first step of the analysis, the probability that the triplet remains in a fully operational state decreases sharply. If the initial value equals 1 (complete reliability at the start), then after the first step it falls below 19%, and in subsequent steps approaches values on the order of 0.02–0.04, which are almost negligibly small.
In such a situation, using the probability of full operability as the primary verification metric leads to several methodological difficulties:
- First, the absolute values become extremely small, which reduces the sensitivity of the metric to changes and complicates interpretation of the results;
- Second, the probability of full operability is not additive and does not fully capture the behavior of a degrading system;
- Third, in real engineering systems, the most relevant and critical quantity is not the probability of perfect operability, but the probability of system failure, i.e., the situation in which at least one component fails.
Given this, it is advisable to use the inverse quantity—namely, the probability of the inoperable state, as shown in Columns 4 and 5 of Table 2. This approach offers several significant advantages:
- The probabilities of failure increase over time, which corresponds to the intuitive understanding of accumulated failures in a system;
- The metric is more sensitive to changes in degradation dynamics, especially during the first steps of the analysis;
- When the probabilities of full operability are very small, differences between the models in absolute values may be nearly indistinguishable; in contrast, comparison based on the probability of failure allows for a more accurate assessment of deviations.
In addition, another important engineering consideration must be taken into account. In practical engineering calculations, probabilities on the order of 0.04 and 0.02—although differing by a factor of two—are effectively regarded as equally small and are classified as failure modes. This scale insensitivity to small values is typical in technical diagnostics, where the fact of transitioning into an unreliable state is more important than the exact numerical value of the probability.
For this reason, the probability of failure more accurately reflects the functional state of the system in the context of practical applications of reliability models.
Considering the above, for the purposes of verification, system reliability assessment, and degradation process modelling, it is advisable to use the probability of loss of operability, as it is more informative, more sensitive to changes, and more interpretable from an engineering perspective.
The conducted study allows us to conclude that the TRICAM method has demonstrated high accuracy in modelling the aggregated state probabilities of the system. Particularly accurate results are observed during the first steps of the Markov analysis, where the discrepancy with the classical model is virtually zero. At later steps, the error remains within 1–2%, which is fully acceptable for engineering analysis and practical dependability calculations.
6. Discussion
The obtained results demonstrate that the proposed TRICAM method is a promising tool for the quantitative assessment of the dependability of complex electric power systems in the context of sustainability. Its application enables a substantial reduction in the computational complexity of Markov analysis without sacrificing accuracy, thereby creating prerequisites for the implementation of practical real-time reliability monitoring methods. This is of direct relevance to the resilience of smart-city energy infrastructures, where stable power supply, recoverability after failures, and optimal resource utilization constitute the foundation of both energy and societal resilience.
It should be noted that the proposed TRICAM methodology is aimed at the analytical derivation of reduced-dimensional Markov models, including transition probability matrices and stationary characteristics of system states. In the present work, the primary focus is placed on structural reduction and methodological aspects of constructing such models, rather than on the computation of specific numerical dependability indicators. In this context, Monte Carlo methods may be considered as an additional tool for statistical validation and for assessing the uncertainty of results under variations in the model input parameters. The implementation of such simulation-based analysis is regarded as a promising direction for future research.
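As a hedged illustration of what such a simulation-based check might look like, the sketch below estimates the aggregated-state probabilities by direct simulation, reusing the hypothetical per-element probabilities from the earlier fragment; none of this is part of the TRICAM methodology itself.

```python
import numpy as np

rng = np.random.default_rng(seed=1)
a, r = 0.57, 0.05            # hypothetical per-step element probabilities
n_runs, n_steps = 100_000, 5

# True = operational; simulate n_runs independent triplets step by step.
state = np.ones((n_runs, 3), dtype=bool)
for _ in range(n_steps):
    u = rng.random(state.shape)
    state = np.where(state, u < a, u < r)   # stay up w.p. a, recover w.p. r

# Empirical probabilities of the aggregated classes after n_steps; these can
# be compared against the analytically derived state probabilities.
failed = (~state).sum(axis=1)
print(f"P(100%) ≈ {np.mean(failed == 0):.4f}")
print(f"P(50%)  ≈ {np.mean((failed == 1) | (failed == 2)):.4f}")
print(f"P(0%)   ≈ {np.mean(failed == 3):.4f}")
```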
From a practical perspective, the obtained stationary probabilities of the system states can be used not only for the qualitative interpretation of dependability, but also for the evaluation of standard quantitative indicators of power supply reliability and resilience, in particular the System Average Interruption Duration Index (SAIDI), which characterises the average duration of supply interruptions experienced by consumers over a specified observation period. For this purpose, after forming the system transition matrix, further analysis is performed within the framework of a stationary (static) Markov approach.
At the first stage, based on the transition matrix, the stationary distribution of the three aggregated states is determined, where each value is interpreted as the average fraction of time during which the system, under long-term operation, resides in the corresponding operational state.
At the second stage, the stationary probabilities are used to estimate the average fraction of time during which consumers experience insufficient power supply. For the three-state aggregation (100%, 50%, 0%), this quantity can be computed as

D = 0 · π₁₀₀ + 0.5 · π₅₀ + 1 · π₀,

where the coefficients reflect an approximate level of loss of the system’s primary function in degraded states.

At the third stage, the SAIDI indicator is obtained by scaling the fraction of time with insufficient supply D by the selected observation period T:

SAIDI = D · T.
Thus, the system transition matrix serves as the initial basis for the consistent derivation of stationary state probabilities and the subsequent calculation of integral indicators of power system reliability and resilience.
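The three stages can be condensed into a brief computational sketch. The transition matrix below is a hypothetical placeholder, while the loss coefficients (0, 0.5, 1) and the relation SAIDI = D · T follow the description above.

```python
import numpy as np

# Hypothetical 3x3 transition matrix of the aggregated states
# (100%, 50%, 0%); each row sums to 1.
P = np.array([[0.90, 0.08, 0.02],
              [0.30, 0.60, 0.10],
              [0.20, 0.30, 0.50]])

# Stage 1: stationary distribution pi, solving pi P = pi with sum(pi) = 1.
A = np.vstack([P.T - np.eye(3), np.ones(3)])
b = np.array([0.0, 0.0, 0.0, 1.0])
pi, *_ = np.linalg.lstsq(A, b, rcond=None)

# Stage 2: average fraction of time with insufficient supply,
# D = 0*pi_100 + 0.5*pi_50 + 1*pi_0.
loss = np.array([0.0, 0.5, 1.0])
D = loss @ pi

# Stage 3: SAIDI over an observation period T (e.g., 8760 h = one year).
T = 8760.0
SAIDI = D * T
print(f"pi = {np.round(pi, 4)}, D = {D:.4f}, SAIDI = {SAIDI:.1f} h/period")
```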
The solar power plant considered in this study is selected as an illustrative case intended to demonstrate the operating principles, computational advantages of the proposed TRICAM method, and the assessment of its accuracy. The purpose of this example is not to perform a comparative analysis of different system configurations or scales, but rather to provide a clear illustration of the structural decomposition and state aggregation procedure. It should be emphasised that the TRICAM method is not tied to a specific size or type of electric power system and is equally applicable to systems of different capacities, topologies, and component compositions. A comparative analysis of systems of varying scale and configuration represents a natural direction for future research and may be carried out in subsequent studies.
It should be noted that hierarchical state aggregation may potentially lead to the accumulation of approximation errors when moving between successive levels of the model. In the TRICAM method, this effect is limited by maintaining a fixed number of aggregated states at each level, preserving the stochastic consistency of the transition matrices, and applying logically interpretable aggregation rules. In addition, comparison with the results of conventional Markov analysis indicates that the resulting deviations do not have a significant impact on the final assessment of system dependability. A more detailed investigation of error propagation effects in multi-level aggregation is considered a subject for future research.
In the broader context of sustainability, dependability can be viewed as the technical backbone of energy stability—it ensures system reliability and adaptability in combination with economic efficiency and environmental balance. The integration of TRICAM into energy-network monitoring systems enables the assessment of failure risks, planning of technical maintenance based on predicted component degradation, and informed decision-making aimed at reducing energy losses and minimizing the carbon footprint. This is particularly important for distributed energy systems that combine local renewable energy sources, storage units, and consumers within a unified digital network.
In such distributed power systems, the TRICAM method can be used for multilevel dependability assessment—from individual nodes (solar panels, ESS units, inverters) to clusters of microgrids. Its ability to preserve the Markov structure and maintain scalability enables analysis even in large decentralized systems without exponential growth in computational costs. This makes TRICAM highly suitable for evaluating the sustainability of distributed systems, in which supply stability, energy efficiency, and the capacity for local autonomy function as key indicators of system robustness.
Thus, the new methodology provides a scientifically grounded basis for strategies aimed at integrating renewable energy sources, enhancing grid flexibility, and strengthening the energy security of regions.
At the same time, despite the positive results, the proposed methodology has certain limitations. First, TRICAM is oriented primarily toward the technical aspects of reliability and does not account for cybersecurity factors, which become critically important in the context of the digitalization of the energy sector. Modern smart grids integrate IoT devices, sensors, remote controllers, SCADA systems, and cloud services, all of which significantly expand the surface of potential cyber-threats. Cyberattacks on energy systems may lead not only to short-term disturbances but also to cascading failures and prolonged outages, which directly affects the sustainability of energy infrastructure. Studies show that without integrating cybersecurity into reliability models, it is impossible to ensure the full resilience of an energy system, even when its technical performance indicators are high [69].
Second, the methodology requires high-quality input data—failure statistics, transition matrices, and reliability parameters. In real conditions, obtaining such precise data may be challenging, particularly for heterogeneous or newly deployed technological systems that lack operational history. Third, the practical use of TRICAM requires highly reliable software capable of performing computations without errors and with an appropriate level of data protection, since a malfunction in the code or unauthorized access may distort the analysis results and lead to incorrect managerial decisions.
In future research, it would be advisable to extend the TRICAM framework by integrating it with machine-learning methods for the automated estimation of dependability parameters, as well as with cyber-physical resilience technologies. This would make it possible to combine technical, informational, and cyber resilience within a unified framework. Such integration will be an essential prerequisite for ensuring the energy sustainability of smart cities and regional power systems within the broader context of sustainable-development policy.