1. Introduction
Electrical grids transmit electrical power from generation stations to load centres and are vulnerable to severe natural events such as floods [1], earthquakes [2], cyclones [3], ice [4,5,6], typhoons [7,8], and hurricanes [9,10,11]. These events have been shown to cause substantial damage to power system infrastructure and often lead to total system blackouts and high economic losses from restoration, recovery, and reconstruction costs. Malawi is one of the countries that have been hit by tropical cyclones [12]. As shown in Figure 1, the country is landlocked, bordered by Tanzania (north/northeast), Mozambique (east/south/southeast), and Zambia (west). The Shire River is the largest river in Malawi, on which about 90 percent of the hydropower plants are cascaded. Figure 1 also shows the power generation plants and the existing and planned transmission network as of November 2022. Some transmission structures are steel, while others are wooden.
Over twenty catastrophes linked to extreme rainfall, such as floods, landslides, and storms, have occurred in the last decade alone, with a rising trend in the number of people affected. These frequent disasters impose substantial repair and reconstruction costs on the nation, diverting scarce resources from other development needs. The effects of the 2019 Tropical Cyclone Idai (TCI) put Malawi among the five countries most affected by severe weather events globally [12]. The 2022 Tropical Storm Ana (TCA) and Tropical Cyclone Gombe (TCG) resulted in 64 fatalities and affected 945,934 people. More recently, the 2023 Tropical Cyclone Freddy (TCF) affected over two million people, with over 650,000 displaced, 679 dead, 537 missing, and 2186 injured [12]. Failures caused by tropical cyclones are expected to worsen due to climate change [13]. In January 2022, the TCA caused a national blackout lasting many hours because generation and transmission assets were lost. The Kapichira hydropower plant was severely damaged [14,15], and many transmission towers along the Shire River were brought down, overwhelming the mitigation measures that had been put in place [16]. Operational challenges are also evident from multiple total power system shutdowns that occurred in the absence of extreme events. A detailed study on the resilience of Malawi’s grid operator is reported in [17], where operational, technological, regulatory, political, financial, and management issues were identified as some of the factors affecting response capabilities. The issues discussed above underscore the need to move from response to long-term resilience [18].
The concept of resilience was first defined in ecology by Holling [19] as the capacity of systems to return to their original state after severe disruptions. Over the years, the concept has moved from ecology to power systems. Power systems resilience (PSR) has been defined in different ways [20,21]. All the definitions describe resilience capacities, including preparedness (prevention), absorption, restoration, adaptation, and sometimes transformation, which have been reviewed and explained in [20,22,23,24,25,26].
The assessment of PSR has increasingly relied on a diverse array of indicators that quantify the capacity of the network to withstand, adapt to, and recover from adverse events. Different studies have presented a range of indicators [26]. Ahmadi and others [26] presented a set of measurable indicators for various energy systems alongside their formulations. Notably, these resilience indicators were not uniform. The resilience trapezoid and other indicators were classified as numerical measures in [21]. A resilience trapezoid has also been considered an alternative way to define PSR [25] and a quantitative resilience indicator [21]. One of the quantitative frameworks covering operational and infrastructural resilience is that proposed by Panteli [27], which systematically captures the performance of a power system during the phases of the resilience trapezoid through a metric system called ΦΛΕΠ, pronounced FLEP. Φ (or F) defines the rate of resilience degradation, Λ (or L) characterises the extent of degradation during the event, E shows the duration in the degraded state, and Π (or P) quantifies the recovery rate. Since its introduction, only a few studies [27,28,29,30,31,32] have used the FLEP metric system to quantify resilience. The review of the literature in Section 2 identified the following gaps, which form the basis of this paper:
The existing FLEP metric quantifies only the absorptive, adaptive, and restorative (recovery) capacities; the preventive and transformative capacities are not covered.
Existing studies that applied the FLEP metric framework did not integrate the framework with its indicators and the essential resilience capacities.
Infrastructure quantitative indicators for hydro-based generators have not been explored.
Although Malawi is exposed to severe weather-related disruptions and resilience challenges are location-specific, resilience studies on Malawi’s power system have not been conducted.
If these gaps are not addressed, resilience challenges risk being only partially solved, for example, by concentrating on corrective strategies, emergency responses, and damage assessments while unintentionally leaving out asset management and system upgrades [33]. In addition, resilience challenges are location-specific, and context is very important in resilience building. As one way of ensuring quick restoration following a disruption, Cesar [34] showed that it is important for grid operators to develop resilience strategies that classify hazards and devise preparatory measures. These strategies are usually context- and time-dependent because resilience improvement strategies are not one-size-fits-all [35]. To address the gaps identified above, this paper aims to (i) develop an extended metric framework that can quantify all the standard resilience capacities (preventive, absorptive, adaptive, restorative, and transformative), (ii) provide an integrated analysis of the proposed framework and how it relates to its resilience indicators and capacities, (iii) propose a generator-based infrastructure indicator, and (iv) apply the proposed framework to Malawi’s transmission network.
In this work, the existing FLEP metric framework is extended to the novel ΑΦΛΕΠΤ metric system, pronounced AFLEPT. Α (or A) assesses the preventive ability by providing the status of the resilience indicator before the event, ΦΛΕΠ retain their original definitions, and Τ (or T) estimates the power system’s transformative capacity. An integrated analysis of the AFLEPT framework, its indicators, and the resilience capacities is also conducted, demonstrating a comprehensive, integrated framework for assessing power grid resilience that accounts for all the phases of the resilience trapezoid. Further, a performance evaluation of Malawi’s transmission grid during the 2022 TCA is performed using DIgSILENT PowerFactory (PF) 2023 SP5 (x64) software. Conducting this assessment on a real power system with actual power system data, using industry-standard software, provides a platform for solving real resilience challenges.
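As a rough illustration of the two added metrics, the sketch below reads A as the pre-event indicator level relative to a reference level (for example, the installed capacity) and T as the gain in the indicator between the end of restoration and the end of transformative work. Both formulas, and all function and parameter names, are assumptions made for illustration; they are not taken from the formal definitions of the framework and may differ from them.

```python
def preventive_status(pre_event_level: float, reference_level: float) -> float:
    """A (assumed form): pre-event indicator level as a fraction of a reference level."""
    if reference_level <= 0:
        raise ValueError("reference_level must be positive")
    return pre_event_level / reference_level


def transformative_gain(level_after_restoration: float,
                        level_after_transformation: float) -> float:
    """T (assumed form): change in the indicator during the transformative phase.

    A positive value means the system was built back better;
    zero means no transformation took place.
    """
    return level_after_transformation - level_after_restoration


# Hypothetical indicator values in MW, chosen only to exercise the functions.
a = preventive_status(pre_event_level=400.0, reference_level=500.0)
t = transformative_gain(level_after_restoration=400.0,
                        level_after_transformation=400.0)
print(f"A = {a:.2f}, T = {t:.1f} MW")  # prints "A = 0.80, T = 0.0 MW"
```

Under this reading, A below 1 signals that the indicator was already short of its reference level before the event, and T equal to zero signals a system that was merely restored rather than transformed.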
The AFLEPT metric provides a comprehensive view of performance over a specified period. It allows utilities and grid operators to measure their performance against regulatory standards and benchmarks, facilitating compliance, encouraging continuous improvement in resilience, and supporting decision-making. However, its accuracy depends on the availability and quality of input data, and its effectiveness may vary with the scale of application, potentially limiting its utility in certain contexts. Combining the metric with the resilience capacities enables the evaluation of both the operational performance of a power grid and its inherent capability to withstand, adapt to, and recover from severe disturbances. This mixed approach ensures that both the symptoms of resilience weaknesses, as highlighted by the AFLEPT metrics, and the underlying capabilities needed to address these weaknesses, as highlighted by the resilience capacities, are considered, resulting in more informed and effective resilience-improving techniques. The integrated framework further provides a valuable tool for planning, operating, and managing resilience within power systems. It also offers a practical resource for policymakers reflecting on the potential implications of resilience strategies. Finally, it serves as a guideline for those responsible for managing the electricity grid, offering a roadmap to validate their resilience strategies. It is also of interest to investigate whether the performance curve of a real power system, obtained from actual power system data, follows a pattern similar to that produced from simulated data on test networks.
Thus, the main contributions of this paper are as follows:
New methods: an integrated framework for assessing power grid resilience and a new quantitative indicator.
New work: resilience evaluation of Malawi’s transmission grid against tropical cyclones.
Context-based infrastructural and operational resilience enhancement strategies.
2. Literature Review
The definition of PSR is derived from definitions provided by other disciplines [21]. In the context of PSR definitions, the words “grid,” “infrastructure,” “network,” “power system,” and “system” are used interchangeably. The definition used in this paper is adapted from Malin [35]: the capacity of an interconnected system (comprising components, institutions, and operators) to proactively plan, prepare, cope with threats, respond, and transform, lessening impacts, allowing fast recovery, promoting adaptive improvements, reducing risks and vulnerabilities, and addressing current and future threats. This definition considers all the periods of the resilience trapezoid and all the standard resilience capacities (preventive, absorptive, adaptive, restorative, and transformative). The assessment of PSR has increasingly relied on a diverse array of indicators that quantify the ability of the grid to withstand, adapt to, and recover from adverse events. Different studies have presented a range of indicators [26]. Ahmadi and others [26] presented a collection of quantitative indicators for various energy systems alongside their formulations. Notably, these resilience indicators were not uniform. Indicators are selected to reflect the vulnerability of the system and its resilience, and may include ecological, physical, social, economic, institutional, and infrastructure variables [36]. The chosen indicator should align with the system’s purpose (functionality), be tied to the system’s unique features (characteristics), and be easy to understand and apply (simplicity). Indicators should also detect meaningful changes (sensitivity), accurately measure the intended attribute (validity), and rely on data that are easy to obtain (accessibility). They should tie directly to system goals and stakeholders (relevance), be reliable under varying conditions (robustness), produce consistent results when replicated (reproducibility), and cover the relevant system aspects (scope). Finally, it should be established whether the required data are available (availability), and the indicators should be cost-effective to measure (affordability) [36].
The resilience trapezoid and other indices were presented as quantitative indicators in [21]. A resilience trapezoid has also been considered an alternative way to define PSR [25] and a quantitative resilience indicator [21]. The resilience trapezoid visualises the capacities of power systems throughout the phases of a disruption, i.e., it represents the resilience level with respect to time [35], as shown in Figure 2. The time on the X-axis of the resilience trapezoid could be in seconds, minutes, hours, days, weeks, or months. In this example, time is assumed to be in days. Six time points are marked on the axis: day 0, before the event strikes; the day the event strikes; the day the event ends; the day restoration or recovery starts; the day restoration ends; and the day transformative work ends.
The Y-axis shows the measured quantity (the resilience capacity), which has variously been called the grid functionality [20], performance level [5,28,37], resilience level [38], resilience indicator [39], system function [21], power system status [5,7], performance index [40], resilience [37], and system performance [23]. Although different terminologies have been used, the Y-axis represents a suitable resilience indicator, which can include the number of transmission lines online, the number of affected customers, the load energy unserved, the maximum number of online units, the amount of generation capacity, and the load demand. The indicator may be expressed as a percentage or as an absolute value, for example, in megawatts to represent the amount of power consumed.
Before the event, from day 0 until the day the event strikes, the resilience level is assumed to be adequate. This pre-event stage has been associated with the preventive, mitigative, and anticipative resilience features. In this phase, regular system operation has been assumed [41,42]; the resilience indicators are supposed to be at their full capacity. The system is also considered to be in its primary operation mode (intact and functional) [23]. However, this may not always be the case. Other studies have claimed that in this stage, systems are robust [37], resilient [43], reliable, stable [20], reconditioned [37], and prepared for impending grid events [23]. Preparing for an impending disaster may mean having an adequate generation reserve (that is, the generated capacity as a percentage of the operational installed capacity leaves enough reserve), redundant transmission lines, emergency generators, prepositioned repair crews and materials [42], preventive maintenance, and non-exposed infrastructure. Although this stage has been associated with robustness and resistance, it is more accurately described as preparation, anticipation, and prevention, whose effectiveness can be observed when an event strikes. Thus, this phase is associated with preparedness capacity indicators. Resilience activities in this phase may include resilience planning, damage forecasting, impact estimation, the prepositioning of resources such as repair crews and materials, and system monitoring [21,28]. Cesar [34] indicated that planning strategies could take two approaches: one for when there are no disasters and one for when a disaster is upcoming.
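The generation-reserve check mentioned above can be sketched as follows. The utilisation is the generated capacity as a percentage of the operational installed capacity, and the reserve is what remains. The function names and the 15 percent minimum reserve are hypothetical values chosen for illustration, not figures from this paper.

```python
def capacity_utilisation_pct(generated_mw: float, installed_mw: float) -> float:
    """Generated capacity as a percentage of operational installed capacity."""
    if installed_mw <= 0:
        raise ValueError("installed_mw must be positive")
    return 100.0 * generated_mw / installed_mw


def reserve_margin_pct(generated_mw: float, installed_mw: float) -> float:
    """Share of operational installed capacity left unused, in percent."""
    return 100.0 - capacity_utilisation_pct(generated_mw, installed_mw)


def has_adequate_reserve(generated_mw: float, installed_mw: float,
                         min_reserve_pct: float = 15.0) -> bool:
    """True if the reserve meets a (hypothetical) minimum threshold."""
    return reserve_margin_pct(generated_mw, installed_mw) >= min_reserve_pct


# Hypothetical pre-event operating points in MW.
print(has_adequate_reserve(400.0, 500.0))  # 20 % reserve
print(has_adequate_reserve(450.0, 500.0))  # 10 % reserve
```

A grid operating close to its installed capacity before an event has little headroom to mask the loss of a generating unit, which is why the pre-event value of this ratio is a natural preparedness indicator.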
When a severe event strikes, the system functionality drops or starts to drop. How fast and how low the system functionality drops depends on how robust the system is, that is, on its ability to absorb or resist the impacts of severe events. By implication, the rate of degradation may manifest the system’s absorptive (coping, resistive, or survivability) capacity [5,7,23,37,38,44]. Depending on the level of the absorptive capacity, the system can lose part or all of its capacity (how low the resilience indicator on the Y-axis drops). How the system responds to the extreme event manifests the system’s vulnerability, fragility, and survivability [20,37,40], but also the intensity of the event [42]. Thus, the rate and magnitude of the decrease in functionality, or degradation, manifest the system’s vulnerability, survivability, or absorptive capacity. By extension, the indicators that measure the rate and magnitude of system degradation are mapped to this absorption phase. However, the decline in functionality may sometimes be masked by adequate alternative resources, such as the generation reserve, for example, when it is measured in terms of the number of affected customers. How long the system remains at the reduced capacity depends on the prevailing weather conditions and the preparedness, adaptive, recovery, and restorative capabilities. As resilience is usually associated with severe events, power grid operators may at this stage only monitor the disaster and the damage to ascertain the system health and update the stakeholders. Operators can also start planning for restoration based on preliminary monitoring reports to speed up the recovery process [34].
The period from the end of the event to the start of restoration has been reported to be a response phase [23,37,44] in which most systems have been described as degraded [5,7,28,37,39,45], disrupted [20], or in an extremis (critical) state [41]. The duration for which the system remains degraded may reflect the system’s fragility, agility, and the level of redundancy measures in place [20,37,44,45]. This transition phase has also been considered an adaptation stage where the event experiences are used as learning points to moderate future occurrences [38]. Thus, the adaptive capacity, the duration between the time the event stops and the time restoration starts, and their associated indicators have been connected to this phase. Zidane [42], however, described the adaptation stage as a long-term preparatory stage that runs from the end of restoration until the next impending event. While it is true that the adaptive strategy involves long-term learning and planning, this learning and planning actually starts before restoration commences, i.e., during the period when the system is in a degraded state. Assessments and observations may be conducted to evaluate the extent of the impact, which may guide the emergency response (if needed), resource mobilization, and recovery plans [5]. The strategies made during the event transition phase may be assessed, re-evaluated, and approved to optimize the resources [34].
The period from the start to the end of restoration represents the recovery [20,23,37,38,40,44] or restoration [5,7,39,41,45] stage. Although the terms restoration and recovery have been used differently, both aim to ensure that functionality is reinstated. The restorative ability determines the rate at which this restoration takes place. Thus, the recovery or restoration rate is associated with the recovery phase. The speed with which the system is restored to its original status depends on several factors, including the availability of repair materials and maintenance crews, the accessibility of the affected infrastructure, weather conditions, and financial resources. Although only restoration and recovery have been associated with this stage, a system may sometimes be declared obsolete or unrepairable and scrapped. Zidane [42] indicated that restoration is bi-dimensional: it can be measured either by the speed of functionality recovery or by the portion of the functionality or equipment that is restored. In this paper, both dimensions are accounted for in the analysis.
After restoration, the system has been described as recovered and in a stable [20], normal [5,7,41], transformed [38], or ultimate operational mode [23]. The system may be improved or transformed based on lessons learned from the extreme event. Instead of reverting to the original state, the system reaches a transformed, new normal, or better state (building back bigger, stronger, and better). Even though Malin [35] associated transformation with any form of change rather than only an increase, the change considered in this context is an increasing transformation. Consequently, the increase in resilience indicators can be mapped to the transformative capacity and to the phase from the end of restoration to the end of transformative work. Thus, severe environmental conditions can sometimes be used as the basis for building the systems back better [46], that is, for increasing the systems’ functionality or capacities. Whether this happens may depend on finances, politics, technology, and human resources.
Assessing the resilience of power systems to severe weather conditions has always been challenging due to the unavailability of standard assessment frameworks, indicators, and metrics. One of the quantitative frameworks covering operational and infrastructural resilience is that proposed by Panteli [27], which systematically captures the performance of a power system during the phases of the resilience trapezoid through a metric system called ΦΛΕΠ, pronounced FLEP. Φ (or F) defines the rate of resilience degradation, while Λ (or L) characterises the extent of degradation during the event, from the day it strikes to the day it ends; E shows the duration in the degraded state, between the end of the event and the start of restoration; and Π (or P) quantifies the recovery rate between the start and the end of restoration. A fifth metric, the area, which depends on all four metrics, was also proposed in [27]. The resilience trapezoid shown in Figure 2 demonstrates how the FLEP metric relates to the resilience transition phases. The figure shows that the existing FLEP metric only covers the absorption, adaptation, and restoration (recovery) features among the critical resilience characteristics. Moreover, existing studies applying the FLEP metric framework do not integrate the framework with its indicators and the essential resilience capacities. These gaps are addressed in this paper.
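To make the four FLEP quantities concrete, the following minimal sketch evaluates them on an idealised, piecewise-linear trapezoid. It assumes that the indicator falls linearly while the event lasts, stays flat until restoration starts, and then recovers linearly to its pre-event level. The class, field, and function names are hypothetical, the area is taken here as the area under the curve from the strike to the end of restoration (one possible reading of the area metric), and the numbers are invented for illustration only.

```python
from dataclasses import dataclass


@dataclass
class TrapezoidPoints:
    t_strike: float         # time the event strikes
    t_event_end: float      # time the event ends
    t_restore_start: float  # time restoration starts
    t_restore_end: float    # time restoration ends
    level_pre: float        # indicator level before the event
    level_degraded: float   # indicator level in the degraded state


def flep(p: TrapezoidPoints) -> dict:
    """Return F, L, E, P and the area for an idealised trapezoid."""
    if not (p.t_strike < p.t_event_end <= p.t_restore_start < p.t_restore_end):
        raise ValueError("time points must be in chronological order")
    drop = p.level_pre - p.level_degraded
    event_duration = p.t_event_end - p.t_strike
    restore_duration = p.t_restore_end - p.t_restore_start

    phi = -drop / event_duration              # F: degradation rate (negative slope)
    lam = drop                                # L: extent of degradation
    e = p.t_restore_start - p.t_event_end     # E: time spent degraded
    pi = drop / restore_duration              # P: recovery rate

    # Area under the piecewise-linear curve, phase by phase.
    mean_level = (p.level_pre + p.level_degraded) / 2
    area = (mean_level * event_duration
            + p.level_degraded * e
            + mean_level * restore_duration)
    return {"F": phi, "L": lam, "E": e, "P": pi, "area": area}


# Invented example: indicator in percent, time in hours.
result = flep(TrapezoidPoints(0, 10, 30, 60, level_pre=100, level_degraded=40))
print(result)
# {'F': -6.0, 'L': 60, 'E': 20, 'P': 2.0, 'area': 3600.0}
```

In this invented case the degradation rate is three times the recovery rate in magnitude, the kind of asymmetry that the ratio of F to P is meant to expose.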
Since its introduction, only a few studies [27,28,29,30,31,32] have used the FLEP metric system to quantify resilience. In these studies, different severe weather conditions were considered, all simulated, as shown in Table 1. Both operational and infrastructural indicators have been used, although some case studies [29,31] only used operational indicators. So far, transmission lines [27,28] and distribution feeder circuits [30] have been the infrastructure indicators used, while the amount of generation and load connected were used as operational indicators. Furthermore, a test version of the Great Britain (GB) transmission network was used, and where a real power system was studied (Minnesota), only part of the distribution network was used. Hossain et al. [29] found that the FLEP metric is inversely proportional to the area of the resilience trapezoid but directly proportional to the resilience capacities, and validated the resilience quantification obtained with the FLEP metric against the resilience capacity metric. Kemabonta [30] proposed a syncretistic framework for grid resilience and reliability. Although Kemabonta [30] claimed to have combined FLEP and resilience capacities to assess the power system’s resilience, this was not demonstrated. The resilience capacities were only discussed in relation to resilience enhancement, that is, how specific enhancement techniques could potentially enhance the capacities.
Quantitative resilience evaluation has frequently centred on the quantification of system performance. So far, different quantitative resilience frameworks have been reported in the literature [7,8,21,25,39,40,43,44,45,47,48,49,50,51,52,53,54], from which numerous conclusions and recommendations have been drawn. The frameworks lack uniformity and consistency because there is no standard metric or framework [34]. While some start with threat identification and/or characterisation [7,43,44,50,53], others start by defining the resilience goals [47,52]. Others begin by defining data requirements [8,21,45,48,55] and the resilience metrics [40]. There are limited studies on pre-event resilience assessments (preparedness). One framework demonstrated the need for resilience planning, which helps identify weak points and informs planning and operational decisions [8]. Resilience frameworks depend on the location because events are area-specific. Although resilience studies are location-specific and Malawi is exposed to severe disruptions, resilience studies on Malawi’s power system have not been conducted. Extreme disasters have been classified by return period for mitigation purposes, and these return periods are usually location-specific. For example, the Malawi 2015 floods were classified as a 1-in-500-year event [18], while the 2016 drought was classified as a 1-in-35-year event [56]. While the 2022 TCA was associated with a 1-in-50-year return period [57], the 2023 TCF could not be classified in terms of the return period. However, the TCF reached the equivalent intensity of a category five hurricane at its peak [58]. The impacts of tropical cyclones or hurricanes have been evaluated in the United States [13], India [59,60], the Republic of Korea [61], and China [3]. Ref. [47] demonstrated that resilience frameworks are not one-size-fits-all methods by developing a framework in a developing-country setting, using Uganda as a case study, to facilitate sustainable development. The authors argued that most frameworks, which are atomistic in their classifications of indicators, fail to demonstrate how local actions contribute to globally defined sustainable development targets. This is why it is suggested that resilience enhancement should incorporate the locality of resilience challenges by conducting location-specific assessments.
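The return periods quoted above can be translated into probabilities with the standard relation between a return period and its annual exceedance probability. The sketch below assumes that years are statistically independent and that the return period is stationary, which climate change may well invalidate; the function names are hypothetical.

```python
def annual_exceedance_probability(return_period_years: float) -> float:
    """Probability that an event of the given return period occurs in any one year."""
    if return_period_years < 1:
        raise ValueError("return period must be at least one year")
    return 1.0 / return_period_years


def prob_at_least_one_event(return_period_years: float, horizon_years: int) -> float:
    """Probability of at least one such event within the planning horizon.

    Assumes independent years and a stationary return period.
    """
    p = annual_exceedance_probability(return_period_years)
    return 1.0 - (1.0 - p) ** horizon_years


# Return periods cited in the text, evaluated over a 30-year planning horizon.
for label, period in [("2015 floods", 500), ("2016 drought", 35), ("2022 TCA", 50)]:
    print(f"{label}: {prob_at_least_one_event(period, 30):.1%} chance in 30 years")
```

Even a 1-in-50-year event has a substantial chance of occurring at least once over the planning horizon of a transmission asset, which supports treating such events as design cases rather than as exceptions.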
5. Conclusions and Future Research Plans
This study proposed an integrated resilience framework for assessing the resilience of electrical grids. The framework integrates the resilience indicators, the AFLEPT metric system, and the five standard resilience capacities to give a comprehensive resilience assessment. Malawi’s real power system, with real power system data, was used as a case study to apply the method by evaluating its resilience to the 2022 TCA. The model was run in DIgSILENT PowerFactory 2023 SP5 to provide a true picture of the actual system. The active power generated and consumed were used as the operational resilience indicators, while the number of transmission lines brought down and the number of hydroelectric generators out of service were used as the infrastructural resilience indicators. The performance of these indicators over time was observed by running a quasi-dynamic simulation (QDS) before, during, and after the TCA. The resilience analysis was performed by comparing the obtained time-dependent indicators with predefined thresholds. Further, the relationships among the indicators, AFLEPT metrics, and resilience capacities were evaluated. For indicators that fell short of the thresholds and had a decreasing relationship with the AFLEPT metrics and capacities, the infrastructural resilience challenges causing the deficit were identified, which informed resilience improvement planning. The results suggest that Malawi’s power system is vulnerable to extreme events (tropical cyclones), showing a state of unpreparedness and a weak preventive capacity against shocks. The power system could not withstand the impacts of the TCA. Over 50 percent of the grid’s total active power was lost within 24 h, with a degradation rate almost six times higher than the recovery rate. The power system’s lack of flexibility was observed throughout the event, and the infrastructure remained dilapidated, which led to reduced electricity delivery. The power system stayed in a reduced functionality state for up to 23 h.
Up to 100 percent of the hydroelectric generating units were rendered out of service following the impacts of the TCA. A hydropower plant with a capacity of 130 MW, representing about 18 percent of the grid generation capacity, was left non-operational for 16 months. About 8 percent of the transmission lines, which are key lines supplying major load centres, were brought down, leaving a considerable share of grid-connected customers without electricity supply. The recovery rate was lower than the degradation rate during the TCA, which undermines the recovery and restoration priorities. This reduced response capacity could be due to infrastructural and operational challenges, such as (a) insufficient infrastructure capacity, (b) overdependence on hydropower and weak, aged infrastructure, and (c) the compromised response capacity of the grid operator, which could be due to a lack of repair materials, poor access to the disaster-hit areas, and a shortage of repair crews and resilience funds. Furthermore, no transformation happened after restoration, following the non-implementation of the national energy policy. The power system lacks the preventive, absorptive, adaptive, restorative, and transformative resilience capacities. Consequently, infrastructural and operational (focusing on improving the operator response) resilience improvement measures have been discussed, which aim to address the identified resilience challenges. While the proposed resilience enhancement measures have demonstrated their effectiveness in some studies, future research would benefit from evaluating other grid-hardening measures and from economic modelling.
Further studies on other grid-hardening measures could include (i) other generation-based enhancement measures, such as the extent of generation expansion, the diversification of generation sources, and renewable energy (RE) integration for specific resilience goals, and (ii) evaluating the effectiveness of different infrastructural enhancement measures. Economic modelling could include (i) comparing the cost of line rerouting with that of making the existing lines more robust, (ii) a cost–benefit analysis of different redundancy enhancement scenarios to determine which would provide economic benefits, and (iii) a cost–benefit analysis of different infrastructural enhancement measures.