1. Introduction
Drought remains a serious threat to the livelihoods of millions worldwide. Combating drought is essential for achieving sustainable development goals, as it supports the preservation of ecosystem services and enhances the quality of life for millions in regions vulnerable to drought (E/CN.17/2009/19) [
1]. According to the same source, encouraging the establishment of scientifically based drought- and desertification-related local, national, and, where appropriate, regional indicators remains a challenge in combating drought.
Climate change, as well as other trends such as rapid urbanization, changes the water regime; hence, the sustainability concept is becoming an essential challenge for the scientific community. The impacts of emerging global climate change threats are very serious for urban areas due to the potentially high population density [
2], as well as threats to food security and ecosystems. One notable consequence of climate change is that water flows are becoming more unpredictable due to the increased frequency of extreme weather events and changes in rainfall patterns over time and space. This results in more intense floods and droughts, severe wildfires, rising sea levels, polar ice melt, catastrophic storms, and a decline in biodiversity [
3]. Therefore, climate change is expected to increase the risk of hydrometeorological events with significant social, environmental, and economic impacts [
4,
5].
As the frequency of droughts is expected to increase in the coming years, their reliable detection, in combination with other indices, is a valuable tool for mapping drought-prone areas to mitigate the impact on water resources. This, in conjunction with the construction of suitable infrastructure (e.g., dams, rainwater harvesting systems) and legislative modifications, facilitates the sustainable management of water resources in accordance with the Sustainable Development Goals (SDGs) and circular economy principles.
Over the last few years, complex networks theory has been utilized to study the complicated environmental interactions for climate dynamics [
6,
7]. Spatially correlated climatic observables, such as heatwaves, droughts or extreme floods, are modelled as complex networks highlighted as ‘teleconnections’ [
6,
8]. Complex network analysis explores the topological climate characteristics, aspiring to reveal the underlying mechanisms and contributing to weather forecasting of extreme events under climate change. Furthermore, the topological analysis can be used to identify hotspot regions of extreme events (extreme drought [
8] or rainfall [
6]).
In this work, we propose a method to analyze and characterize the intensity of meteorological drought by transforming the ‘inverted’ precipitation time series of an area into a complex network. This transformation is achieved using the visibility algorithm proposed by Lacasa et al. [9]. Then, the structural properties of the network are studied, and the severity of the drought is evaluated. Thus, complex network measures (e.g., closeness centrality) can serve as indices for drought characterization.
The visibility method of Lacasa et al. [
9] has many applications, e.g., in physics and mechanics [
10,
11,
12], medicine [
13,
14] and finance [
15,
16]. For instance, in Charakopoulos et al. [
11], the authors use the visibility algorithm to analyze the magnetohydrodynamic channel flow. Additionally, John R. and John M. [
17] apply the visibility algorithm to investigate the time lag between rainfall and water level fluctuations in Lake Okeechobee. Nevertheless, this is the first time (to our knowledge) that the visibility algorithm is combined with complex networks theory to estimate the characteristics of drought events, similar to well-known indices, e.g., Standardised Precipitation Index (SPI).
We highlight here that the well-known drought indicators (RDI, SDI, SPI; RDI: Reconnaissance Drought Index, SDI: Standardized Discharge Index [
18,
19,
20]) use either mean values or the median and standard deviation based on historical data [
18,
21]. However, accurately estimating the mean value is challenging due to either limited data or climate change. Our method proposes a different approach to characterizing drought severity, one that does not rely on low-order statistics (e.g., mean and variance). The paper is organized as follows. In Section 2, we present the methodological steps and the network measures that characterize the topological structure. In Section 3, a description of the areas under study is given, and in Section 4, we present the results for the Arnaia and Florina areas. Finally, we conclude in Section 5.
2. Materials and Methods
Initially, we present the construction of a complex network from a time series based on the visibility algorithm in [
9,
11]. According to the visibility algorithm, nodes correspond to time (years in our case), and edges are created in proportion to the intensity of the yearly drought value. Thus, high-intensity drought years (i.e., nodes with high visibility) are expected to play a central role in the network structure.
Consequently, one can study and analyze the topological properties (the structure of the network) and characterize the drought years using an arsenal of network measures. For instance, by utilizing centrality measures, one can characterize which year is important in the evolution of droughts. These measures are degree centrality, closeness centrality, and betweenness centrality.
Another important marker for drought organization is the modularity index (i.e., the detection of network communities), which serves as an alternative method to define the meteorological drought cyclicity. Different communities correspond to different cycles of drought. In total, network analysis can be used to study the organization of drought and to identify rare (extreme) drought events, offering valuable information for the sustainability of water resources.
2.1. Network Construction Using Visibility Algorithm
Let $\{y_i\}_{i=1}^{n}$ be a given discrete time series, where $y_i$ is the value observed at time $t_i$. Each data point $(t_i, y_i)$ constitutes a node $i$ in the network. Two nodes are connected, i.e., $i \leftrightarrow j$, if they satisfy the visibility criterion: no intermediate point between $(t_i, y_i)$ and $(t_j, y_j)$ obstructs the view line between them. To be more specific, we express the visibility principle using the slope of a line segment. Let $\lambda_{AC}$ be the slope of the line segment $AC$ with end points $A = (t_a, y_a)$ and $C = (t_c, y_c)$, which is defined as $\lambda_{AC} = (y_c - y_a)/(t_c - t_a)$. Let $B = (t_b, y_b)$, with $t_a < t_b < t_c$, be any intermediate point of $AC$. Then, the point C is visible from A if for each $B$:

$$\lambda_{AB} < \lambda_{AC}. \tag{1}$$

Solving Equation (1) with respect to $y_b$, we obtain the visibility condition:

$$y_b < y_a + (y_c - y_a)\,\frac{t_b - t_a}{t_c - t_a}. \tag{2}$$

Equation (2) simply says that there is a ‘visibility line’ between A and C, which does not intersect any intermediate data height [9]. A concrete example is given in Figure 1. The time series is depicted in Figure 1A, which contains 8 points, from 1 to 8. Point A (for x = 1) has visibility of Point B but does not have visibility of Point C. This is explained as follows: the intermediate Point B between A and C ‘blocks’ the view (the grey line from A to C). Using the mathematical definition of the slope of the segment, we find for the intermediate Point B that $\lambda_{AB} > \lambda_{AC}$, that is, the visibility property of Equation (1) is not satisfied.
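The visibility condition of Equation (2) can be sketched in a few lines of Python. This is a minimal illustration rather than the authors' implementation: the function name `visibility_edges` is hypothetical, the time stamps are assumed to be the equally spaced indices 0, 1, 2, ..., and the brute-force pairwise check is chosen for readability, not speed.

```python
def visibility_edges(y):
    """Return the edges (a, c), a < c, of the natural visibility graph of y.

    Assumption: observation times are the list indices 0, 1, 2, ...
    """
    n = len(y)
    edges = []
    for a in range(n - 1):
        for c in range(a + 1, n):
            # Equation (2): every intermediate point b must lie strictly
            # below the straight line joining (a, y[a]) and (c, y[c]).
            visible = all(
                y[b] < y[a] + (y[c] - y[a]) * (b - a) / (c - a)
                for b in range(a + 1, c)
            )
            if visible:
                edges.append((a, c))
    return edges


# A peak in the middle blocks the view between the two outer points ...
blocked = visibility_edges([1, 3, 1])
# ... whereas a trough in the middle does not.
open_view = visibility_edges([3, 1, 3])
```

Consecutive points always see each other, so the resulting network is connected by construction.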
Employing the visibility algorithm, we transform the time series into a complex network; see Figure 1B. The numbering represents the abscissa points (e.g., the years sequentially). The network can be represented using the adjacency or connectivity matrix $A$. If nodes $i$ and $j$ are connected, then $a_{ij} = 1$; otherwise, $a_{ij} = 0$. For example, Node 1 is connected with Node 2, which implies $a_{12} = 1$, while Nodes 1 and 3 are not connected; thus, $a_{13} = 0$. In Figure 1C, we show the corresponding adjacency matrix of the example network in Figure 1D.
The aforementioned procedure defines a complex network which can be presented as a graph $G = (N, E)$, where $N$ is the set of nodes and $E$ is the set of edges. The network connectivity is ‘stored’ in the adjacency matrix $A$. The elements $a_{ij}$ of the adjacency matrix $A$ take the values 1 and 0 according to connectivity, i.e., $a_{ij} \in \{0, 1\}$, where $a_{ij} = 1$ indicates that nodes $i$ and $j$ are connected. Using the adjacency matrix, we compute important network measures that characterize the connectivity. A representative set of network measures contains the degree centrality, the betweenness centrality, the mean path length and the community detection or modularity.
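As a small illustration of how the connectivity is stored, the sketch below builds a symmetric 0/1 adjacency matrix from an edge list. The four-node edge list and the helper name `adjacency_matrix` are hypothetical, and nodes are indexed from 0 here, so the text's Node 1 corresponds to index 0.

```python
def adjacency_matrix(n, edges):
    """Build the symmetric 0/1 adjacency matrix of an undirected network."""
    A = [[0] * n for _ in range(n)]
    for i, j in edges:
        A[i][j] = 1
        A[j][i] = 1  # undirected: i <-> j
    return A


# Hypothetical example network with four nodes (0-based indices).
edges = [(0, 1), (1, 2), (1, 3), (2, 3)]
A = adjacency_matrix(4, edges)
```

In this example the first two nodes are linked while the first and third are not, mirroring the Node 1, Node 2 and Node 3 case described above.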
2.2. Measures Characterizing the Drought Network Connectivity
Network measures are used to obtain information about the connectivity structure of the network. The main categories are centralities, which characterize the importance of a node, clustering measures which describe motifs and how they appear in the networks, and community detection, which is a partition of the network into different subgroups [22,23,24,25,26,27,28,29].
Network quantities can be separated into local, when the quantity describes a node property, and global (or macroscopic), when results are obtained from statistics over all the members of the network. For example, the statistical distribution of a network property and its first-order statistics, i.e., the mean value, are obtained as global measures.
2.2.1. Degree Centrality
The degree of a node $i$, or degree centrality, refers to the number of links connected to it [22]. In the case of an undirected network (node $i$ connected with node $j$ and vice versa), the degree centrality is defined as

$$k_i = \sum_{j=1}^{n} a_{ij},$$

where $n$ is the number of nodes and $a_{ij}$ is the element of the adjacency matrix $A$ (i.e., $a_{ij} = 1$ iff $i \leftrightarrow j$). A high degree (a large number of links) of the $i$th node indicates the importance of that node in the network. The degree distribution $P(k)$ defines the probability of a randomly selected node having a specific degree $k$.
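A minimal sketch of the degree and its distribution, computed as row sums of a small hypothetical adjacency matrix. The helper names are illustrative and not taken from the paper.

```python
from collections import Counter


def degrees(A):
    """Degree of each node: the row sums of the adjacency matrix."""
    return [sum(row) for row in A]


def degree_distribution(A):
    """Empirical probability that a randomly chosen node has degree k."""
    ks = degrees(A)
    counts = Counter(ks)
    return {k: c / len(ks) for k, c in sorted(counts.items())}


# Hypothetical four-node undirected network.
A = [
    [0, 1, 0, 0],
    [1, 0, 1, 1],
    [0, 1, 0, 1],
    [0, 1, 1, 0],
]
k = degrees(A)
P = degree_distribution(A)
```

Here the second node has the most links, so it would rank highest under degree centrality.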
2.2.2. Path Lengths, Closeness Centrality and Clustering Coefficient
A path between nodes $i$ and $j$ with a minimum number of successive edges (steps) constitutes a geodesic path. The minimum number of steps between these two nodes in the network defines the shortest path length or distance $d_{ij}$ between nodes $i$ and $j$. Averaging over the set of all shortest paths, we obtain the mean path length of the network [22,30]:

$$\langle l \rangle = \frac{1}{n(n-1)} \sum_{i \neq j} d_{ij},$$

where $n$ is the number of nodes. The mean path length shows how fast information can spread in the network. A low mean shortest path length $\langle l \rangle$ shows that any two random nodes $i, j$ can interact very fast. A similar well-known phenomenon is the ‘six degrees’ of social connectedness of Kevin Bacon [31].
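The mean path length can be sketched with a breadth-first search from every node. This is an illustrative sketch that assumes an unweighted, connected, undirected network (a visibility network is connected because consecutive points always see each other). The example network and function names are hypothetical.

```python
from collections import deque


def bfs_distances(adj, source):
    """Shortest path length (in steps) from source to every reachable node."""
    dist = {source: 0}
    queue = deque([source])
    while queue:
        v = queue.popleft()
        for w in adj[v]:
            if w not in dist:
                dist[w] = dist[v] + 1
                queue.append(w)
    return dist


def mean_path_length(adj):
    """Average of d_ij over all ordered pairs i != j (connected network assumed)."""
    n = len(adj)
    total = 0
    for i in adj:
        total += sum(d for j, d in bfs_distances(adj, i).items() if j != i)
    return total / (n * (n - 1))


# Hypothetical four-node network given as adjacency lists.
adj = {0: [1], 1: [0, 2, 3], 2: [1, 3], 3: [1, 2]}
avg = mean_path_length(adj)
```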
A centrality measure which estimates the importance of a node with respect to its distance $d_{ij}$ from other nodes is closeness centrality. Nodes that are in a central position in the network are closer to all other nodes. Closeness centrality is defined as

$$C_i = \frac{1}{\sum_{j} d_{ij}},$$

and it computes the inverse sum of the shortest paths with respect to the $i$th node. A central node $i$ lies at short distances from the other nodes, which implies that the denominator $\sum_{j} d_{ij}$ is small; thus, the closeness centrality $C_i$ is expected to take a high value.
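A short sketch of closeness centrality as the inverse of the summed shortest-path distances. It assumes an unweighted, connected network and uses the unnormalized form (no factor of n - 1), which is an assumption about the convention; the example network is hypothetical.

```python
from collections import deque


def bfs_distances(adj, source):
    """Shortest path length (in steps) from source to every reachable node."""
    dist = {source: 0}
    queue = deque([source])
    while queue:
        v = queue.popleft()
        for w in adj[v]:
            if w not in dist:
                dist[w] = dist[v] + 1
                queue.append(w)
    return dist


def closeness_centrality(adj):
    """Unnormalized closeness: 1 / (sum of distances to all other nodes)."""
    return {i: 1 / sum(bfs_distances(adj, i).values()) for i in adj}


# Hypothetical four-node network; node 1 is adjacent to every other node.
adj = {0: [1], 1: [0, 2, 3], 2: [1, 3], 3: [1, 2]}
closeness = closeness_centrality(adj)
most_central = max(closeness, key=closeness.get)
```

Node 1 reaches every other node in one step, so its distance sum is the smallest and its closeness the largest.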
Another important measure which characterizes connectivity is the clustering coefficient. It measures the proportion of possible triangles through a node that actually exist. Specifically, the clustering coefficient of node $i$ is defined as the following ratio:

$$c_i = \frac{\text{number of triangles connected to node } i}{\text{number of triples centered on node } i}.$$

The higher the number of triangles that exist with respect to the $i$th node, the higher the clustering coefficient.
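The ratio can be sketched by counting, for each node, how many pairs of its neighbours are themselves linked. The sketch below uses the common convention that a node with fewer than two neighbours has a coefficient of 0, which is an assumption; the example network is hypothetical.

```python
from itertools import combinations


def clustering_coefficient(adj, i):
    """Fraction of neighbour pairs of node i that are linked to each other."""
    neighbours = sorted(adj[i])
    k = len(neighbours)
    if k < 2:
        return 0.0  # convention (assumed): no triples centered on this node
    triangles = sum(1 for u, v in combinations(neighbours, 2) if v in adj[u])
    triples = k * (k - 1) / 2
    return triangles / triples


# Hypothetical four-node network given as neighbour sets.
adj = {0: {1}, 1: {0, 2, 3}, 2: {1, 3}, 3: {1, 2}}
c = {i: clustering_coefficient(adj, i) for i in adj}
```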
2.2.3. Betweenness Centrality
Another important measure, which quantifies the significance of a node, is betweenness centrality. Betweenness centrality measures the amount of influence which a node has with respect to the total information flow in the network. Important nodes that connect different subgraphs in the network (i.e., act as a bridge) show high betweenness centrality. The betweenness centrality $b_i$ of the $i$th node is mathematically defined as the fraction of all shortest paths in the network that pass through the node; that is,

$$b_i = \sum_{j \neq i \neq k} \frac{\sigma_{jk}(i)}{\sigma_{jk}},$$

where $\sigma_{jk}(i)$ is the number of shortest paths from $j$ to $k$ passing through $i$, and $\sigma_{jk}$ is the number of shortest paths between nodes $j$ and $k$. Higher values of $b_i$ indicate that the node acts as a central hub.
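Betweenness centrality is usually computed with Brandes' algorithm rather than by enumerating all shortest paths. The sketch below is one such implementation for unweighted, undirected networks; it is not taken from the paper. As an assumed convention, each unordered pair of end points is counted once and no normalization is applied.

```python
from collections import deque


def betweenness_centrality(adj):
    """Brandes' algorithm for an unweighted, undirected network."""
    bc = {v: 0.0 for v in adj}
    for s in adj:
        # Breadth-first search from s, counting shortest paths (sigma).
        order = []
        preds = {v: [] for v in adj}
        sigma = {v: 0 for v in adj}
        dist = {v: -1 for v in adj}
        sigma[s], dist[s] = 1, 0
        queue = deque([s])
        while queue:
            v = queue.popleft()
            order.append(v)
            for w in adj[v]:
                if dist[w] < 0:
                    dist[w] = dist[v] + 1
                    queue.append(w)
                if dist[w] == dist[v] + 1:
                    sigma[w] += sigma[v]
                    preds[w].append(v)
        # Accumulate pair dependencies, farthest nodes first.
        delta = {v: 0.0 for v in adj}
        for w in reversed(order):
            for v in preds[w]:
                delta[v] += sigma[v] / sigma[w] * (1 + delta[w])
            if w != s:
                bc[w] += delta[w]
    # Each unordered pair was visited from both end points.
    return {v: b / 2 for v, b in bc.items()}


# Hypothetical network: node 1 is the only route from node 0 to nodes 2 and 3.
adj = {0: [1], 1: [0, 2, 3], 2: [1, 3], 3: [1, 2]}
b = betweenness_centrality(adj)
```

Node 1 lies on the shortest paths between node 0 and each of nodes 2 and 3, so it is the only node with non-zero betweenness in this example.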
2.3. Detection of Communities and Modularity
In many cases, networks can be partitioned into subsets (subgraphs) such that there is dense internal connectivity (connectivity among nodes in the subset) and sparse connectivity to other subgraphs [
24]. The partition of the network into densely connected subgraphs (or communities) plays a significant role in information processing within the network since it offers a modular view of the process on the network [
32,
33].
From graph theory, the expected number of edges between nodes $i$ and $j$ (if edges were positioned at random) is $k_i k_j / 2m$ [24,34], where $m$ is the total number of edges in the network. The quantity $Q = \sum_{i,j \in K} \left( a_{ij} - \frac{k_i k_j}{2m} \right)$ expresses the deviation of the connectivity of subgraph $K$ from an equivalent network with the same node degrees that is constructed randomly, i.e., using the configuration model ([35], Chapter 13).

According to [24], quantity $Q$ is extended as follows: (a) The modality index assigns a community number $s_i$ to each node. For example, if there are two communities, then $s_i = \pm 1$. Then, the term $\frac{1}{2}(s_i s_j + 1) = 1$ if $s_i, s_j$ have the same sign (i.e., belong to the same community) and 0 otherwise. (b) The modularity (or modality) is given by

$$Q = \frac{1}{4m} \sum_{i,j} \left( a_{ij} - \frac{k_i k_j}{2m} \right) (s_i s_j + 1) = \frac{1}{4m} \sum_{i,j} \left( a_{ij} - \frac{k_i k_j}{2m} \right) s_i s_j,$$

where the second equality holds because $\sum_{i,j} \left( a_{ij} - k_i k_j / 2m \right) = 0$. Using matrix multiplication, the last sum is written as follows:

$$Q = \frac{1}{4m}\, \mathbf{s}^{T} B\, \mathbf{s},$$

where $B$, with elements $B_{ij} = a_{ij} - k_i k_j / 2m$, is the resultant modularity matrix. Here, we seek the best network partition, i.e., we separate the network into subgraphs in order to maximize the modularity function $Q(\mathbf{s})$. The matrix $B$ has the form of a graph Laplacian matrix, and in such matrices, the optimization can be achieved using graph partitioning or spectral partitioning (eigenvalue–eigenvector decomposition) of $B$ [24,36].
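To make the modularity function concrete, the sketch below evaluates Q for a given two-community assignment using the standard two-community form from Newman's formulation, Q = (1/4m) Σ (a_ij − k_i k_j / 2m) s_i s_j. It only scores a partition; it does not search for the best one. The example network (two triangles joined by a single edge) and the function name are hypothetical.

```python
def modularity(A, s):
    """Modularity Q of a two-community split encoded as s[i] = +1 or -1."""
    n = len(A)
    k = [sum(row) for row in A]      # node degrees
    two_m = sum(k)                   # twice the number of edges
    total = 0.0
    for i in range(n):
        for j in range(n):
            b_ij = A[i][j] - k[i] * k[j] / two_m  # modularity matrix element
            total += b_ij * s[i] * s[j]
    return total / (2 * two_m)       # 1 / (4m)


def adjacency(n, edges):
    A = [[0] * n for _ in range(n)]
    for i, j in edges:
        A[i][j] = A[j][i] = 1
    return A


# Hypothetical network: two triangles connected by the edge (2, 3).
A = adjacency(6, [(0, 1), (0, 2), (1, 2), (2, 3), (3, 4), (3, 5), (4, 5)])
q_natural = modularity(A, [1, 1, 1, -1, -1, -1])   # one triangle per community
q_mixed = modularity(A, [1, -1, 1, -1, 1, -1])     # triangles split apart
```

The split that keeps each triangle intact scores higher than the one that cuts through both, which is the behaviour a modularity-maximizing partition exploits.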
2.4. Data and Drought Time Series Preprocess
For the drought analysis, the precipitation data are transformed as follows:
and then, we normalize as follows:
Clearly, the normalized values lie in the interval [0, 1], where the value 0 corresponds to the case of maximum precipitation and the value 1 to the case of minimum precipitation, which implies that the normalized series expresses the drought evolution as ‘inverse’ precipitation.
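One transformation consistent with an ‘inverse’ precipitation series is sign inversion followed by min-max scaling. The sketch below assumes exactly that form, and further assumes that the result is scaled to the unit interval with 0 at maximum precipitation and 1 at minimum precipitation. The paper's own equations may differ, and the function name and sample values are hypothetical.

```python
def inverted_normalized(precipitation):
    """Map precipitation to a drought series in [0, 1].

    Assumed form (not confirmed by the source): invert the sign, then
    apply min-max scaling, so the wettest year maps to 0 and the driest to 1.
    """
    inverted = [-p for p in precipitation]
    lo, hi = min(inverted), max(inverted)
    if hi == lo:
        raise ValueError("constant series cannot be normalized")
    return [(v - lo) / (hi - lo) for v in inverted]


# Hypothetical annual precipitation totals (mm).
drought = inverted_normalized([400, 800, 600])
```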
5. Discussion
In this study, a new approach to investigating and analyzing meteorological drought is proposed. Drought severity is characterized by network measures such as degree, betweenness and closeness centrality. One advantage of the network method is that it is not based on the mean value and standard deviation, as opposed to classical indicators of drought (RDI, SPI, and SDI), which are dependent on low-order statistics.
Both cases (Arnaia and Florina regions) show structural network similarities; in particular, both networks display high centrality nodes. As we show in
Figure 4 and
Figure 8, these nodes are the extreme drought years. The other computed network measures (degree centrality, closeness centrality) confirm the localization of these extreme drought events.
5.1. Utilizing Network Measure Distributions to Characterize Drought Rare Events
The computed distributions of the network measures offer a macroscopic view of network connectivity, i.e., the distributions show the properties of the entire connectivity (in a period of 57 years). The main macroscopic characteristic that is revealed is the heterogeneous structure of the networks: in both cases (Arnaia and Florina), there are a few nodes acting as ‘Hubs’, i.e., nodes with high centrality (degree, betweenness and closeness centrality). These are identified as extreme drought years. Indeed, in Figure 6 and Figure 10, the 95th percentile value is also depicted, i.e., the value below which 95% of all measurements fall. In the case of Arnaia, these are nodes 24 → 1984–1985, 39 → 1999–2000 and 51 → 2011–2012. Similarly, in the case of Florina, the extreme-rare drought years are 4 → 1964–1965, 29 → 1989–1990 and 46 → 2006–2007.
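The percentile-based selection of extreme years can be sketched as follows. The percentile uses linear interpolation between order statistics, which is one of several common definitions and an assumption here. The centrality values are invented for illustration, and the node-to-year mapping (node n corresponds to the hydrological year starting in 1960 + n) is inferred from the node and year pairs quoted in the text.

```python
def percentile(values, q):
    """q-th percentile with linear interpolation between order statistics."""
    xs = sorted(values)
    pos = (len(xs) - 1) * q / 100
    lower = int(pos)
    upper = min(lower + 1, len(xs) - 1)
    return xs[lower] + (xs[upper] - xs[lower]) * (pos - lower)


def extreme_nodes(scores, q=95):
    """1-based labels of nodes whose score lies strictly above the percentile."""
    threshold = percentile(scores, q)
    return [node for node, value in enumerate(scores, start=1) if value > threshold]


def hydrological_year(node):
    """Node label to hydrological year; offset inferred from the text's examples."""
    return f"{1960 + node}-{1961 + node}"


# Invented centrality scores for 21 nodes (not data from the study).
scores = [1, 2, 1, 3, 2, 1, 2, 30, 1, 2, 3, 1, 2, 1, 2, 3, 1, 2, 1, 2, 25]
extreme = extreme_nodes(scores)
years = [hydrological_year(node) for node in extreme]
```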
The results are compatible with previous studies [
5,
38]. Indeed, in Kourtis et al. [
5], the case of Chrysoupoli station is reported (
Figure 9 in [
5], 100 km distance from Arnaia). As measured in [
5] (
Figure 9 therein), the drought patterns are in very good agreement with our results. Specifically, the following major drought events are reported in [
5]: 1984–1985, 1989–1990, 1991–1992 and 1999–2000. Regarding the severe drought in the whole Greek area, the results are not identical concerning the spatial distribution. For instance, a severe drought took place in the Cyclades and Athens during the hydrological year 1999–2000 (based on the RDI,
) according to [
38], which is compatible with our results regarding both the Arnaia and the Florina stations. However, in this year, the intensity of drought cannot be characterized as severe in the case of Thessaly [
38]. During 1989–1990, a severe drought occurred in Athens and the Cyclades, as well as in Florina in our study, but this does not hold true for Thessaly [
38].
In conclusion, our analysis characterizes rare (extreme) events as the nodes (years) whose betweenness centrality lies above the 95th percentile. Our probabilistic approach has similarities with other statistical methods for drought identification. For example, one of the most widely used methodologies is the statistical Theory of Runs (ToR) proposed by Yevjevich [
39]. Although ToR is not a drought index or indicator, it can be used to analyze hydrometeorological time series in order to differentiate drought as a feature [
40]. A ‘run’, according to this theory, is a segment of a drought time series, in which each value is either above or below the prescribed truncation level [
41].
5.2. Comparison with the Standardized Precipitation Index-SPI
The annual Standardized Precipitation Index (SPI) was calculated for both examined areas. The SPI is based on a standardization method [18,20], where if SPI values are below −2 ($\mathrm{SPI} \leq -2$), an extreme drought event takes place. Subsequently, the linear correlation was determined by comparing the SPI annual values with degree centrality, closeness centrality, betweenness centrality and the clustering coefficient. For both cases, the linear correlation was closest to −1 when applying closeness centrality. More specifically, in the case of the Arnaia station, the linear correlation coefficients were equal to −0.55, −0.71, 0.64 and −0.8 when comparing the annual SPI with the annual values of betweenness centrality, degree centrality, clustering coefficient and closeness centrality, respectively. In the case of the Florina station, the linear correlation coefficients were −0.57, −0.59, 0.66, and −0.78 for the same comparisons.
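The linear correlation between an annual index and a network measure can be sketched with the Pearson coefficient. The sample values below are invented and perfectly anti-correlated purely for illustration; they are not the SPI or centrality values of the study.

```python
from math import sqrt


def pearson(x, y):
    """Pearson linear correlation coefficient of two equal-length sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    spread_x = sqrt(sum((a - mean_x) ** 2 for a in x))
    spread_y = sqrt(sum((b - mean_y) ** 2 for b in y))
    return cov / (spread_x * spread_y)


# Invented toy data: drier years (lower SPI) paired with higher closeness.
spi = [-2.0, -1.0, 0.0, 1.0, 2.0]
closeness = [0.5, 0.4, 0.3, 0.2, 0.1]
r = pearson(spi, closeness)
```

A coefficient near −1, as in this toy case, means that the network measure rises as the SPI falls, which is the sense in which the text compares the two.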
In summary, closeness centrality can be used to describe the intensity of drought, producing results comparable to the existing methodology for assessing the intensity of meteorological drought. For illustration purposes, in
Figure 11, we depict the closeness centrality measure multiplied by (−1) to achieve the same monotonicity as the annual SPI (the correlation then becomes positive). In the cases of both Arnaia and Florina, the annual SPI is in very good agreement with closeness centrality.
5.3. Emergent Cyclicity or Periodicity from Network Analysis
The modularity algorithm [
24,
36] (i.e., the network communities detection) revealed the organization of the drought into these extreme events. The networks of Arnaia and Florina were partitioned into successive periods with central nodes and extreme drought events. The successive periods are highlighted in the color code in
Figure 3B and
Figure 7B. Consequently, the proposed network method can be used to define periods of mild meteorological drought, which are separated by new extreme drought events. These periods have ten members on average (in the case of Arnaia, we found a periodicity of 9 years, while in Florina’s case there was a periodicity of 11 years).
On closer inspection of the network measures, we observe two cyclic behaviors. Counting the period between local maxima in all network measures, we observe a periodicity or cyclicity of approximately 4 years. In conclusion, the network approach reveals two cycles: the first is almost 10 years long, whilst the second is between 3 and 4 years. Remarkably, the authors in [
42,
43] studied the Mediterranean drought and concluded with similar results. Specifically, they reported two periods of cyclicity, one of almost 10 years (9.4) and a second short-term cyclicity behavior (in the Iberian Peninsula). Moreover, by using Fourier analysis, Moreira et al. [
43] suggested the existence of the two most frequent cycles, with periods of 6 years and 9.4 years, respectively, in Portugal.
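Counting the period between local maxima can be sketched as below. This is an illustrative procedure, not necessarily the one used for the reported cycles: a local maximum is taken to be a value strictly larger than both neighbours, and the series is invented.

```python
def local_maxima(values):
    """Indices whose value is strictly larger than both neighbours."""
    return [
        i
        for i in range(1, len(values) - 1)
        if values[i - 1] < values[i] > values[i + 1]
    ]


def mean_spacing(peaks):
    """Average distance between consecutive peaks, or None with fewer than two."""
    gaps = [b - a for a, b in zip(peaks, peaks[1:])]
    return sum(gaps) / len(gaps) if gaps else None


# Invented yearly values of a network measure (not data from the study).
measure = [0.1, 0.6, 0.2, 0.1, 0.3, 0.7, 0.2, 0.1, 0.2, 0.8, 0.3]
peaks = local_maxima(measure)
cycle = mean_spacing(peaks)
```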
One future network approximation topic is using the network features in a machine learning (ML) framework. The network features can be used to create a model for predicting or classifying drought. For example, betweenness centrality has a power law form with . Furthermore, including a higher number of drought data regions, one could study spatial correlations of climatic observables in future work, e.g., drought or extreme floods. Thus, complex network analysis could be applied to explore and reveal possible mechanisms of weather observables (flood or precipitation).