Finding Top-k Nodes for Temporal Closeness in Large Temporal Graphs

Abstract: The harmonic closeness centrality measure associates, to each node of a graph, the average of the inverse of its distances from all the other nodes (by assuming that unreachable nodes are at infinite distance). This notion has been adapted to temporal graphs (that is, graphs in which edges can appear and disappear over time), and in this paper we address the question of finding the top-k nodes for this metric. Computing the temporal closeness of one node can be done in O(m) time, where m is the number of temporal edges. Therefore, computing exactly the closeness of all nodes, in order to find the ones with top closeness, would require O(nm) time, where n is the number of nodes. This time complexity is intractable for large temporal graphs. Instead, we show how this measure can be efficiently approximated by using a "backward" temporal breadth-first search algorithm and a classical sampling technique. Our experimental results show that the approximation is excellent for nodes with high closeness, allowing us to detect them in practice in a fraction of the time needed for computing the exact closeness of all nodes. We validate our approach with an extensive set of experiments.


Introduction
Determining indices capable of capturing the importance of a node in a complex network has been an active research area since the end of the forties, especially in the field of social network analysis, where the ultimate goal has always been to develop theories "to explain the human behavior" [1]. After observing "that centrality is an important structural attribute of social networks", and that there "is certainly no unanimity on exactly what centrality is or on its conceptual foundations", in [2] the author proposed such a conceptual foundation of centrality by making use of graph theory concepts. The node indices proposed in that paper (that is, the degree centrality, the betweenness centrality, and the closeness centrality) have become quite standard notions in complex network analysis. For two of them in particular, that is, closeness and betweenness, a quite large body of literature has been devoted to the design, analysis, and experimental validation of efficient algorithms for computing them, either exactly (e.g., the celebrated Brandes' algorithm for computing the betweenness [3]) or approximately (e.g., the sampling approximation algorithm for estimating the closeness [4]), especially after very large network data became available, thus making the search for very efficient algorithms a necessity. Reporting all the results obtained in this direction is clearly beyond the scope of this paper: we refer the interested reader to one of the several surveys that have appeared in the literature (such as [5]), to one of the several more conceptual works (such as [6]), or to the excellent periodic table of network centrality shown in [7].
In this paper, we focus our attention on the closeness centrality measure, which associates to each node of a graph its average distance from all the other nodes (since we will deal with unweighted graphs only, the distance between two nodes u and v is simply the number of edges included in a shortest path from u to v). In order to deal with the case of (weakly connected) directed graphs, two main alternatives are available when formally defining this measure: one approach assumes that the number of nodes reachable from a node u is known (see, for example, [8]), while the other, which is also called harmonic centrality, uses the inverse of the distances in order to deal with disconnected pairs of nodes (see, for example, [9]). Since in this paper we will use the temporal analogue of the second alternative, we limit ourselves to give the following formal definition. Given a directed graph G = (V, E), the (harmonic) closeness of a node u ∈ V is defined as C(u) = (1/(n − 1)) ∑ v∈V:v≠u 1/d(u, v), where d(u, v) denotes the number of edges included in a shortest path from u to v (by convention, d(u, v) = ∞ if there is no path connecting u to v, so that such pairs contribute 0 to the sum). The harmonic closeness of a node is a value between 0 and 1: the closer C(u) is to 1, the more important the node u is usually considered. For instance, in a directed star with n nodes, the center has closeness equal to 1, while all other nodes have closeness equal to 0. On the contrary, in a directed cycle with n nodes, all nodes have closeness H n−1 /(n − 1), where H k denotes the k-th harmonic number (that is, the sum of the reciprocals of the first k natural numbers).
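As a concrete reference point, the static harmonic closeness can be computed with a single breadth-first search from u; the following minimal Python sketch (function and variable names are ours, not from the paper) reproduces the star example above.

```python
from collections import deque

def harmonic_closeness(adj, u):
    """Harmonic closeness of node u in a directed unweighted graph.

    adj maps each node to the list of its out-neighbours; unreachable
    nodes contribute 0 (i.e., 1/infinity) to the sum.
    """
    n = len(adj)
    dist = {u: 0}
    queue = deque([u])
    total = 0.0
    while queue:
        x = queue.popleft()
        for y in adj[x]:
            if y not in dist:
                dist[y] = dist[x] + 1
                total += 1.0 / dist[y]
                queue.append(y)
    return total / (n - 1)

# Directed star: the centre reaches every leaf in one step (closeness 1),
# while each leaf reaches nobody (closeness 0).
star = {0: [1, 2, 3], 1: [], 2: [], 3: []}
print(harmonic_closeness(star, 0))  # 1.0
print(harmonic_closeness(star, 1))  # 0.0
```

On a directed cycle with n nodes, the same function returns H_{n−1}/(n − 1), as stated above.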
Computing the closeness of a node u in a directed (unweighted) graph is simple: we just have to perform a breadth-first search starting from u and sum the inverses of the distances to all the nodes reached by the search. This requires O(m) time and O(n) space, where n denotes the number of nodes and m denotes the number of edges. However, we are usually interested in comparing the closeness of all the nodes of the graph in order to rank them according to their centrality. This implies that we have to perform a breadth-first search starting from each node of the graph, thus requiring O(nm) time. This computational time is unavoidable (as shown in [10]), unless the strong exponential time hypothesis [11] fails. However, in the case of real-world complex networks, the number of nodes and edges is typically so large that this algorithm is practically useless. For this reason, several approaches have been followed in order to deal with huge graphs, such as computing an approximation of the closeness centrality (see, for example, [4,12]) or limiting ourselves to finding the top-k nodes with respect to the closeness centrality [10]. These algorithms turn out to be so effective and efficient that several of them are already included in well-known and widely used network analysis software libraries (such as [13,14]).
So far, we have talked about static graphs, that is, graphs whose topology does not change over time. In this paper, however, we will focus on (directed) relationships which have timestamps. Such relationships led the research community to the definition of temporal graphs, that is, (unweighted) graphs in which edges are active at specific time instants: for this reason, we call them temporal edges and we denote them by triples (u, v, t), where t is the appearing time of the temporal edge connecting u and v. Temporal graphs are ubiquitous in real life: phone call networks, physical proximity networks, protein interaction networks, stock exchange networks, and public transportation networks are all examples of temporal graphs, in which the nodes are related to each other at different time instants. Until recently, the time dimension has often been neglected by aggregating the contacts between vertices into (possibly weighted) edges, even in cases when detailed information on the temporal sequences of contacts or interactions would have been easily available. For example, collaboration networks (such as scientific or professional collaboration networks) have almost always been analyzed without taking into account the time of the collaboration, even when this information was easily available (such as in the case of the information provided by the DBLP computer science bibliography web site).
However, if the temporal information is just ignored, we can lose important properties of the graph and we can even deduce wrong consequences. For example, in the case of the temporal undirected graph shown in the left part of Figure 1, if we ignore the temporal information associated with the edges, we can erroneously conclude that there exists a path starting from node a, arriving at node c, and visiting the other node b. However, this path does not correspond to a temporally-feasible path, since the edge connecting node a to node b appears after the edge connecting b to c: in other words, when we arrive in b it is too late to take the edge towards c. It is then important to analyze temporal graph properties by taking into account the temporal information concerning the time intervals in which specific edges appear in the graph. For this reason, the community has rethought several classical definitions of graph theory in terms of temporal graphs [15][16][17].
One such definition is that of closeness centrality, which has been repeatedly reconsidered in the case of temporal graphs [18][19][20][21][22][23][24][25][26][27]. In several of these papers, the authors refer to the classical definition of closeness centrality (that is, the one based on the average temporal distance), but in many cases, they actually consider the temporal analogue of the harmonic closeness centrality. In both cases, however, the first step to perform in order to rethink the definition of closeness in terms of temporal graphs consists of defining the temporal distance between two nodes. Even if different notions of distance have been introduced while working with temporal graphs (see, for example, [28]), in this paper, we will focus only on one specific distance definition, which is, in our opinion, one of the most natural ones: that is, the time duration of the earliest arrival path starting no earlier than a specific time instant. This definition is motivated, for example, by the following typical query one could pose to a public transport network: if I want to leave no earlier than time t, how long does it take me to go from one (bus/metro/train) station to another?
More precisely, for any time instant t, a temporal t-path (also called t-journey) is a sequence of edges such that the first edge appears no earlier than t and each edge appears later than the edges preceding it. Its arrival time is the appearing time of its last edge and its duration is the difference between its arrival time and t (plus one, in order to include the traveling time along the last edge). The t-distance d t (u, v) from a node u to a node v is then the duration of any temporal t-path connecting u to v and having the smallest arrival time (once again, if there is no t-path from u to v, we will assume that d t (u, v) = ∞). For instance, in the case of the temporal triangle in the left part of Figure 1, we have that d 1 (c, a) = 2 − 1 + 1 = 2, while d 2 (c, a) = 4 − 2 + 1 = 3: indeed, if we insist on leaving no earlier than time 2, we cannot arrive at a before time 4. Note that, for any t ∈ (2, 4], d t (b, a) = d t (b, c) = ∞, since there are no temporal edges incident to b with appearing time greater than 2. Once we have a definition of distance from a node to another node, we can define the notion of temporal closeness centrality of a node u at a given time instant t by simply applying the harmonic definition of closeness in the case of a static graph (see, for example, [9]). Note that we refer to the harmonic closeness centrality, since, as in the case of weakly connected directed graphs, this definition allows us to deal with the fact that two nodes might not be connected by a temporal path. More precisely, the t-closeness of a node u is defined as C t (u) = (1/(n − 1)) ∑ v∈V:v≠u 1/d t (u, v). In [22], the evolution of C t (u) was analysed in the case of two social networks (an e-mail graph and a contact graph).
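As an illustration, the t-distance on a small temporal graph can be computed with a single scan of the edges in non-decreasing time order (this works because appearing times strictly increase along any temporal path). The following Python sketch, with hypothetical names of our choosing, reproduces the distances of the triangle example, assuming its edges are {a,b} at time 2, {b,c} at time 1, and {a,c} at time 4, and that appearing times are distinct.

```python
import math

def t_distance(edges, u, v, t):
    """Earliest-arrival t-distance d_t(u, v): the minimum duration of a
    temporal path whose first edge appears no earlier than t, where a path
    arriving at time a has duration a - t + 1.  Edges are (x, y, time)
    triples, treated as bidirectional here (as in Figure 1)."""
    avail = {u: t}  # earliest time from which each reached node can leave
    for x, y, time in sorted(edges, key=lambda e: e[2]):
        for a, b in ((x, y), (y, x)):
            if a in avail and time >= avail[a] and time + 1 < avail.get(b, math.inf):
                avail[b] = time + 1  # arrived at `time`, can leave from time + 1
    # arrival time is avail[v] - 1, hence duration (avail[v] - 1) - t + 1
    return avail[v] - t if v in avail else math.inf

triangle = [("a", "b", 2), ("b", "c", 1), ("a", "c", 4)]
print(t_distance(triangle, "c", "a", 1))  # 2
print(t_distance(triangle, "c", "a", 2))  # 3
print(t_distance(triangle, "b", "a", 3))  # inf
```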
To this aim, the authors used an algorithm (inspired by [29]) for computing the t-closeness of a node of a temporal graph, whose time complexity is linear in the number m of temporal edges and whose space complexity is linear in the number n of nodes. For example, we can apply this algorithm to analyse and compare the evolution of the t-closeness in the case of two actors, by referring to the IMDB collaboration graph, where the nodes are the actors and the temporal edges correspond to collaborations in the same (non-TV) movie (the appearing time of an edge is the year of the movie). In the left part of Figure 2, we show the evolution of the t-closeness of Christopher Lee and Eleanor Parker (two actors who were alive approximately in the same period). (Note that the t-closeness is greater than zero even when t is less than the birth year of the corresponding actor. This is not a contradiction, since, in general, a temporal edge may contribute to the t-closeness of a node for all t preceding the appearing time of the temporal edge itself.) As can be seen, the two plots are quite similar until the end of the sixties (even if the plot of Parker has a smaller peak). Afterwards, Parker drastically reduced her activity (indeed, after The Sound of Music in 1965, she participated in only six not very successful movies), while Lee had two other growing periods (most likely, the second one is related to his participation in the Star Wars and The Lord of the Rings sagas). The figure thus suggests that Lee has been more "important" than Parker. In order to capture this idea formally, we introduce a global temporal closeness centrality of a node u in a given time interval [t 1 , t 2 ], which is based on computing the integral of C t (u) over that interval.
That is, for any node u in the graph, we compute and analyse the temporal closeness C(u) of u, which is defined as C(u) = (1/(t 2 − t 1 )) ∫ from t 1 to t 2 of C t (u) dt (intuitively, C(u) can be seen as the normalized Area Under Curve (AUC) value of the function C t (u)). This is similar to what is done in [18] for the betweenness centrality (which is connected to the number of shortest paths that pass through a node): since their betweenness definition depends on the time at which a node is considered, they average it to obtain a global value. Here the integral is the natural equivalent of the average for a continuous function. For example, the closeness of Christopher Lee is approximately equal to 0.005, while the closeness of Eleanor Parker is approximately equal to 0.003, thus confirming the previous intuition that Lee is more central than Parker. The right part of Figure 2, instead, shows the top 20 nodes in the public transport temporal graph of Paris with respect to the temporal closeness centrality measure (the used graph is derived from the data set published in [30], and successively adapted to the temporal graph framework in [31]).

Our Results
Our first contribution is the design and analysis of an algorithm for computing the temporal closeness of a node of a temporal graph in a given time interval, whose time complexity is linear in the number m of temporal edges and whose space complexity is linear in the number n of nodes. This algorithm, which is an appropriate modification of the one used in [22] and adapted from [29], can be seen as a temporal version of the classical breadth-first search algorithm. Computing the temporal closeness of all nodes in order to compare them and find the nodes with highest temporal closeness can, hence, be done in time O(nm), by applying n times the algorithm for computing the temporal closeness of a node. This time complexity, however, is much too high in the case of large temporal graphs.
Our second and more important contribution is showing that the algorithm for computing the temporal closeness of a node can be modified in order to obtain a backward version of the algorithm itself, which allows us to compute the "contribution" C(u, d) of a specific node d to the temporal closeness of any other node u. By using this algorithm (which is inspired by the earliest arrival profile algorithms of [32]), we can then implement a temporal version of the sampling algorithm introduced in [4] in order to approximate the closeness in static graphs. In particular, for a temporal graph with n nodes and m temporal edges, we can compute an estimate of the temporal closeness of all its nodes, whose absolute error is bounded by ε, in time O((log n/ε 2 ) m), which significantly improves over the time complexity of applying n times the algorithm for computing the temporal closeness of a node, that is, O(nm).
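The sampling scheme can be sketched as follows (a minimal sketch with hypothetical names; the `contribution` callback abstracts the backward algorithm, returning the contribution C(u, d) of a sampled destination d to every node u, and the sample size follows the O(log n/ε²) bound stated above):

```python
import math
import random

def estimate_closeness(nodes, contribution, eps, seed=0):
    """Sampling estimate of the temporal closeness of all nodes.

    Draw k = O(log n / eps^2) destination nodes uniformly at random, run the
    backward scan from each (abstracted by `contribution`, which returns the
    dictionary {u: C(u, d)}), and rescale: since C(u) = sum over d of C(u, d),
    the estimator (n/k) * sum_j C(u, d_j) is unbiased for C(u)."""
    n = len(nodes)
    k = max(1, math.ceil(math.log(n) / eps ** 2))
    rng = random.Random(seed)
    estimate = {u: 0.0 for u in nodes}
    for _ in range(k):
        d = rng.choice(nodes)
        for u, c in contribution(d).items():
            estimate[u] += c
    return {u: n * c / k for u, c in estimate.items()}
```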
There is a natural way of using this temporal closeness estimation to empirically find the exact top-k nodes according to our temporal closeness metric. This approach simply consists in running our temporal closeness estimation algorithm, in finding the top-K nodes for the estimated temporal closeness, with K > k, and in computing the exact temporal closeness of these K nodes. Our third contribution is an extensive experimental validation of this approach with a dataset of 45 medium/large temporal graphs. Indeed, we show empirically that using this method we can retrieve the actual top-100 nodes of all the large graphs we have considered by choosing K = 1024, that is (with a little abuse of notation), in time O(2048 × m), which is between 10 and 100 times faster than computing the temporal closeness of all nodes.
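The extraction step above can be sketched in a few lines (hypothetical names; `exact_closeness` abstracts the O(m) exact algorithm, run once for each of the K candidates):

```python
def top_k_candidates(estimated, exact_closeness, k, K=1024):
    """Heuristic top-k extraction: keep the K >> k nodes with the largest
    estimated closeness, compute their exact closeness, and return the
    best k of those (in decreasing order of exact closeness)."""
    candidates = sorted(estimated, key=estimated.get, reverse=True)[:K]
    exact = {u: exact_closeness(u) for u in candidates}
    return sorted(exact, key=exact.get, reverse=True)[:k]
```

The approach is exact in practice only when all true top-k nodes fall among the K candidates, which is what the experiments in Section 5 verify empirically.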

Other Related Work
Besides the references given above, our paper is related to all work on the definition and computation of different temporal centrality measures, such as the temporal betweenness centrality defined in [33], the f-PageRank centrality defined in [34], or the temporal reachability used in [35], just to mention some recent ones. The authors of [36], instead, study the evolution of the closeness centrality for static graphs and propose efficient algorithms for computing it. In the case of static graphs, an approach based on the sampling approximation algorithm of [4], used to select the candidates for which the exact closeness is then computed, was proposed in [37]: its complexity is, however, still quite high, that is, Õ(n^{2/3} m) (under the rather strong assumption that closeness values are uniformly distributed between 0 and the diameter). Still in the case of static graphs, in [38] the authors identify the candidates by looking for central nodes with respect to a "simpler" centrality measure (for instance, the degree of the nodes).

Structure of the Paper
In the rest of this section, we give all the necessary definitions concerning temporal paths, temporal distances and temporal closeness (these definitions are mostly inspired by [16,28,39]). In Section 2 we introduce and analyze our algorithm for computing the temporal closeness of a node of a temporal graph in a given time interval, while in Section 3 we describe and analyze the backward version of this algorithm and we show how this version can be used in order to obtain an error-guaranteed estimate of the temporal closeness of all the nodes of a temporal graph. In these two sections, we assume that the temporal edges have all distinct appearing times: in Section 4, we show how our algorithms can be adapted (without worsening the time and space complexity) to the more general and more realistic case in which multiple edges can appear at the same time. In Section 5 we experimentally validate our approximation algorithm and we show how it can be applied to the problem of finding the top nodes in real-world medium/large temporal graphs. Finally, in Section 6 we conclude by suggesting some research directions and possibly other applications of our backward temporal breadth-first search algorithm.

Definitions and Notations
A temporal graph is a pair G = (V, E), where V is the set of nodes and E is the set of temporal edges. A temporal edge e ∈ E is a triple (u, v, t), where u, v ∈ V are the source and destination nodes of the edge, respectively, and t ∈ N is the appearing time of the edge. If the temporal edges are bidirectional, then (u, v, t) can also be written as (v, u, t). Let t α (respectively, t ω ) denote the minimum (respectively, maximum) appearing time of a temporal edge in E. The time horizon T (G) of a temporal graph G is the interval [t α , t ω ] of real numbers no smaller than t α and no greater than t ω . In this paper, we will assume that the temporal edges are given to the algorithms one after the other (similarly to the streaming model), either in non-decreasing or in non-increasing order with respect to the appearing time.
A temporal path P (also called a temporal walk [17]) in a temporal graph G = (V, E) from a node u ∈ V to a node v ∈ V is a sequence of temporal edges e 1 = (u 1 , v 1 , t 1 ), e 2 = (u 2 , v 2 , t 2 ), . . . , e k = (u k , v k , t k ) such that u = u 1 , v = v k , and, for each i with 1 < i ≤ k, u i = v i−1 and t i ≥ t i−1 + 1. The length of a temporal path is the number of temporal edges included in it. The starting time (respectively, ending time) of a temporal path P, denoted by σ(P) (respectively, η(P)), is equal to the appearing time of the first (respectively, last) temporal edge in the path. Given a time t ∈ T (G) and two nodes u and v, we will denote by P ≥ (u, v, t) the set of all temporal paths P from u to v such that σ(P) ≥ t. Among all these temporal paths, in this paper we will distinguish the ones which allow us to arrive as early as possible.

Definition 1. Given a temporal graph G = (V, E), two nodes u and v in V, and a time t ∈ T (G), a path P ∈ P ≥ (u, v, t) is said to be an earliest arrival t-path if η(P) = min{η(P′) : P′ ∈ P ≥ (u, v, t)}.
Given a time t ∈ T (G) and two nodes u and v, the t-duration of a path P ∈ P ≥ (u, v, t) is defined as δ(P) = η(P) − t + 1. Hence, an earliest arrival t-path is also a path in P ≥ (u, v, t) with minimum t-duration. For this reason, we will also call these paths the shortest t-paths from u to v.

Definition 2.
Given a temporal graph G = (V, E), two nodes u and v in V, and a time t ∈ T (G), the t-distance d t (u, v) from u to v is equal to the t-duration of any shortest t-path from u to v (by convention, if P ≥ (u, v, t) = ∅, then we set d t (u, v) = ∞).
Once we have introduced the notion of t-distance, we can also define the analog of the harmonic closeness centrality in static graphs as follows.
Definition 3. Given a temporal graph G = (V, E), a node u, and a time t ∈ T (G), the t-closeness of u is defined as C t (u) = (1/(n − 1)) ∑ v∈V:v≠u 1/d t (u, v).

An Example
Let us consider the temporal graph shown in the left part of Figure 1. In this case, t α = 1, t ω = 4, and T (G) = [1, 4]. As shown in the right part of the figure, for any t ∈ [1, 2], the duration of a shortest t-path from node a to node b is equal to d t (a, b) = 2 − t + 1 = 3 − t, while, for any t ∈ (2, 4], this duration is infinite since there is no t-path from node a to node b. On the other hand, for any t ∈ [1, 4], the duration of a shortest t-path from node a to node c is equal to d t (a, c) = 4 − t + 1 = 5 − t. Hence, the closeness of node a is equal to C(a) = (1/(2 · 3)) (∫ from 1 to 2 of dt/(3 − t) + ∫ from 1 to 4 of dt/(5 − t)) = (ln 2 + ln 4)/6 ≈ 0.35. Analogously, we can verify that the closeness of node b is C(b) ≈ 0.16, and that the closeness of node c is C(c) ≈ 0.23.
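These values can be checked numerically by integrating the definition directly; the following Python sketch (brute-force t-distances plus midpoint-rule integration, with names of our choosing) recovers the three closeness values of the triangle, assuming its edges are {a,b} at time 2, {b,c} at time 1, and {a,c} at time 4.

```python
import math

EDGES = [("a", "b", 2), ("b", "c", 1), ("a", "c", 4)]  # the temporal triangle
NODES = ("a", "b", "c")
T_ALPHA, T_OMEGA = 1, 4
SORTED_EDGES = sorted(EDGES, key=lambda e: e[2])

def d_t(u, v, t):
    """Earliest-arrival t-distance on the bidirectional triangle."""
    avail = {u: t}  # earliest time from which a reached node can leave
    for x, y, time in SORTED_EDGES:
        for a, b in ((x, y), (y, x)):
            if a in avail and time >= avail[a] and time + 1 < avail.get(b, math.inf):
                avail[b] = time + 1
    return avail[v] - t if v in avail else math.inf

def closeness(u, steps=20000):
    """Midpoint-rule integration of C_t(u) over [T_ALPHA, T_OMEGA]."""
    width = (T_OMEGA - T_ALPHA) / steps
    total = sum(
        sum(1.0 / d_t(u, v, T_ALPHA + (i + 0.5) * width) for v in NODES if v != u)
        for i in range(steps)
    ) * width
    return total / ((len(NODES) - 1) * (T_OMEGA - T_ALPHA))

for u in NODES:
    print(u, round(closeness(u), 2))  # a 0.35, b 0.16, c 0.23
```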

Computing the Closeness
In this section, we propose an algorithm for computing exactly the closeness of a node u of a temporal graph G. This algorithm can be seen as a temporal version of the breadth-first search algorithm starting from a source node s, in which the temporal edges are scanned in non-decreasing order with respect to their appearing time. In the following, we assume that the appearing times of all temporal edges are distinct: the algorithm can be adapted to the case in which this assumption is not satisfied, as we will see below. Moreover, we assume that the temporal graph is directed: if this is not the case, we simply have to examine each edge twice by inverting the source and the destination.
The algorithm maintains, for each node x of G, a triple τ x = (l x , r x , a x ), which indicates that, for any time instant t in (l x , r x ], any earliest arrival t-path P from s to x has ending time η(P) equal to a x (see Algorithm 1). At the beginning, we do not know anything about the reachability of a node x from s: hence, we set the arrival time of x equal to ∞ for an arbitrary time interval (for example, (t α − 2, t α − 1]) preceding t α (line 1). When we read a new temporal edge (x, y, t), we first set τ s = (t − 1, t, t − 1), since, clearly, the source node is always reachable even before the appearance of the edge (line 2). Let τ x = (l x , r x , a x ) and τ y = (l y , r y , a y ) be the two triples associated with x and y, respectively. If r x > r y , then we add to the closeness of s the contribution of node y corresponding to the interval (l y , r y ] (line 3) and we update the triple associated with y by setting τ y = (r y , r x , t) (line 4). This update is justified by Lemma 1. When all temporal edges have been read, we add to the closeness of s the contribution of each node x corresponding to the interval (l x , r x ] (line 5), which is the last interval for which the earliest arrival time has been computed. The way of computing the contribution to the closeness of s (lines 3 and 5) is justified by the proof of Theorem 1.
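Under the stated assumptions (distinct appearing times), the procedure just described can be sketched in Python as follows; all names are ours, and the contribution of an interval (l, r] with finite arrival a is computed in closed form as the integral of 1/(a − t + 1), clamped to start no earlier than t α.

```python
import math

def temporal_closeness(edges, s, directed=False):
    """One-pass computation of the temporal closeness of s: a sketch of the
    forward scan described above, over the (u, v, t) triples in `edges`.
    O(m) time, O(n) space."""
    nodes = {u for u, v, t in edges} | {v for u, v, t in edges}
    times = [t for _, _, t in edges]
    t_alpha, t_omega = min(times), max(times)
    tau = {x: (t_alpha - 2, t_alpha - 1, math.inf) for x in nodes}

    def contrib(l, r, a):
        # integral of 1/(a - t + 1) for t in (max(l, t_alpha), r]
        lo = max(l, t_alpha)
        if a == math.inf or lo >= r:
            return 0.0
        return math.log(a + 1 - lo) - math.log(a + 1 - r)

    C = 0.0
    for x, y, t in sorted(edges, key=lambda e: e[2]):
        tau[s] = (t - 1, t, t - 1)  # the source is always reachable
        arcs = ((x, y),) if directed else ((x, y), (y, x))
        for a, b in arcs:
            la, ra, aa = tau[a]
            lb, rb, ab = tau[b]
            if ra > rb:                       # edge e improves b's profile
                C += contrib(lb, rb, ab)      # retire b's old interval
                tau[b] = (rb, ra, t)
    for x in nodes:                           # flush the last intervals
        if x != s:
            C += contrib(*tau[x])
    return C / ((len(nodes) - 1) * (t_omega - t_alpha))

triangle = [("a", "b", 2), ("b", "c", 1), ("a", "c", 4)]
print(round(temporal_closeness(triangle, "a"), 2))  # 0.35
```

On the triangle of Figure 1 this reproduces the values C(a) ≈ 0.35, C(b) ≈ 0.16, and C(c) ≈ 0.23 computed in the example above.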

Lemma 1.
Let G = (V, E) be a temporal graph and s ∈ V. For any u ∈ V with u ≠ s, let Ξ u = τ u,0 , τ u,1 , . . . , τ u,h u be the sequence of triples τ u,i = (l u,i , r u,i , a u,i ) such that l u,0 = t α − 2, r u,0 = t α − 1, a u,0 = ∞, and, for 1 ≤ i ≤ h u , (l u,i , r u,i , a u,i ) is the triple assigned to τ[u] at the i-th execution of line 4 with y = u during the running of Algorithm 1 with input G and s (note that h u = 0 if this line is never executed with y = u). Then, for any u ∈ V with u ≠ s, the intervals (l u,i , r u,i ], for 0 ≤ i ≤ h u , form a partition of the interval (t α − 2, r u,h u ], and, for any t ∈ T (G) with t ≤ r u,h u , every earliest arrival t-path from s to u has ending time a u,i , where i is the unique index such that t ∈ (l u,i , r u,i ].

Algorithm 1: Algorithm for computing the closeness of a node
Data: Stream of the temporal edges of a directed temporal graph G = (V, E), in non-decreasing order with respect to their appearing time, and a source s.
Result: Real number equal to the closeness of s.
1: for each x ∈ V do τ[x] ← (t α − 2, t α − 1, ∞); C ← 0;
2: while there are other edges to be read do
       let e ← (x, y, t) be the next edge;
       τ[s] ← (t − 1, t, t − 1);
       if r x > r y then
3:         add to C the contribution of y corresponding to the interval (l y , r y ];
4:         τ[y] ← (r y , r x , t);
5: for each x ∈ V with x ≠ s do add to C the contribution of x corresponding to the interval (l x , r x ];
return C;

Proof. We prove the lemma by induction on the number k of temporal edges that have been read. In particular, for any k with 0 ≤ k ≤ |E|, let A(k) be the following statement.
For any u ∈ V with u ≠ s, let Ξ k u = τ u,0 , τ u,1 , . . . , τ u,h k u be the prefix of Ξ u containing the triples assigned to τ[u] at line 4 with y = u after having read k edges. The intervals (l u,i , r u,i ], for 0 ≤ i ≤ h k u , form a partition of the interval (t α − 2, r u,h k u ], and, for any t ∈ T (G) with t ≤ r u,h k u , every earliest arrival t-path from s to u has ending time a u,i , where i is the unique index such that t ∈ (l u,i , r u,i ].
We now prove by induction on k that A(k) is true for any k with 0 ≤ k ≤ |E|.
Base case. k = 0. In this case, no edge has been read yet and, hence, line 4 has never been executed with y = u. We then have that, for any u ∈ V with u ≠ s, h 0 u = 0 and Ξ 0 u contains only the triple τ u,0 = (t α − 2, t α − 1, ∞), whose interval trivially forms a partition of (t α − 2, t α − 1]; moreover, since r u,0 = t α − 1 < t α , the condition on the earliest arrival t-paths is "vacuously" true. Hence, A(0) is true.
Induction step. Given k with 1 ≤ k ≤ |E|, suppose that A(k − 1) is true. We now prove that A(k) is also true. Let e = (x, y, t) be the k-th temporal edge read by the algorithm. Clearly, this edge has no influence on any node other than y (since the graph is directed). Hence, we just have to prove that the value of τ[y] is correctly updated. By the induction hypothesis, we know that the current value of τ[y] = (l y , r y , a y ) is such that, for any t′ ∈ [t α , r y ], the ending time of any earliest arrival t′-path from s to y is at most a y < t. Hence, the edge e cannot improve these ending times, since its appearing time is t. Analogously, we know that the current value of τ[x] = (l x , r x , a x ) is such that a x < t is the ending time of any earliest arrival t′-path from s to x with t′ ∈ (l x , r x ]. If r x ≤ r y , the edge e does not add any information for the node y, since we already know the ending time of any earliest arrival t′-path from s to y, for any t′ ≤ r y . On the contrary (see the left part of Figure 3), if r x > r y , then, for any time instant t′ ∈ (r y , r x ] (for which we did not yet know the corresponding ending time of any earliest arrival t′-path from s to y), we can now say that we can first reach x (at time a x with r x ≤ a x < t), and then wait until the temporal edge e appears in order to move to y at time t: hence, for all these time instants, the earliest arrival time at y can now be set equal to t, that is, the value of τ[y] becomes (r y , r x , t) (note that subsequent edges cannot improve this value, since their appearing times are greater than t). Hence, if Ξ k−1 y = τ y,0 , τ y,1 , . . . , τ y,h k−1 y , we have that Ξ k y = τ y,0 , τ y,1 , . . . , τ y,h k y with h k y = h k−1 y + 1 and τ y,h k y = (r y , r x , t).
By induction hypothesis, the intervals (l y,i , r y,i ], for 0 ≤ i ≤ h k−1 y , form a partition of the interval (t α − 2, r y,h k−1 y ]: by adding the triple (r y , r x , t), we obtain a partition of the interval (t α − 2, r y,h k y ] (since r y = r y,h k−1 y and r x = r y,h k y ). From the previous argument, it also follows that, for any t′ ∈ (l y,i , r y,i ] with 0 ≤ i ≤ h k y , every earliest arrival t′-path from s to y has ending time a y,i : that is, A(k) is true. The lemma then follows from the fact that its statement is exactly equivalent to A(|E|).

Theorem 1. Given a temporal graph G = (V, E) and a source s ∈ V, Algorithm 1 computes the closeness of s in time O(|E|) and space O(|V|).

Proof. From Lemma 1, it follows that, with respect to the node u, each time instant t in an interval (l u,i , r u,i ] with a u,i ≠ ∞ contributes to the closeness of s with the value 1/(a u,i − t + 1). Hence, the closeness of s is equal to C(s) = (1/((n − 1)(t ω − t α ))) ∑ u∈V:u≠s ∑ i:a u,i ≠∞ ∫ from max(l u,i , t α ) to r u,i of dt/(a u,i − t + 1). Note that we used the maximum function in order to deal with the first interval, whose left extreme is smaller than t α and whose right extreme can also be smaller than t α (whenever u is not reachable from s in the interval [t α , t ω ]). Observe that the interval (r u,h u , t ω ] might be non-empty: however, if this is the case, then we can conclude that, if we start from s at a time t in this interval, then we cannot reach u. That is, d t (s, u) = ∞ for any t ∈ (r u,h u , t ω ] and, hence, this interval does not contribute to the closeness of s. Hence, Algorithm 1 correctly computes the closeness of node s. The time complexity of the algorithm is clearly O(|E|), since each temporal edge is analyzed only once, and the update operation requires constant time. The space complexity is linear in the number of nodes, since (apart from the value C), for each node u, we have to maintain just three numbers corresponding to the current value of τ[u].
From the previous theorem, it follows that if we want to compute the closeness of all nodes, this would take time O(|V||E|), since we have to execute Algorithm 1 for each source node s. This time complexity may turn out to be not acceptable in the case of real-world large size temporal graphs.
That is why in the next section, we propose an analog of the sampling algorithm used for approximating the closeness in the case of static graphs [4], based on an appropriate modification of Algorithm 1.

Approximating the Closeness
In order to approximate the closeness in temporal graphs, we first need to introduce the notion of latest starting path. To this aim, given a time t ∈ T (G) and two nodes u and v, we will denote by P ≤ (u, v, t) the set of all temporal paths P from u to v such that η(P) ≤ t.

Definition 4. Given a temporal graph G = (V, E), two nodes u and v in V, and a time t ∈ T (G), a path P ∈ P ≤ (u, v, t) is said to be a latest starting t-path if σ(P) = max{σ(P′) : P′ ∈ P ≤ (u, v, t)}.
Moreover, we need to define the contribution of a destination node d to the closeness of another node u.
Definition 5. Given a temporal graph G = (V, E) and two distinct nodes d and u, the contribution of d to the closeness of u is defined as C(u, d) = (1/((n − 1)(t ω − t α ))) ∫ from t α to t ω of dt/d t (u, d).
By convention, we also set C(u, u) = 0, for any node u ∈ V.
We now introduce a sort of backward version of Algorithm 1 (which can be seen as an adaptation of the earliest arrival profile algorithms proposed in [32]), which has to be applied to a destination node d, and that will allow us to compute, for any other node x, the contribution of d to the closeness of x (that is, C(x, d)) and, hence, to adapt to temporal graphs the well-known sampling technique already used in the case of classical graphs. Differently from the case of Algorithm 1, we assume that the temporal edges are scanned in non-increasing order with respect to their appearing times. Once again, we assume that the appearing times of all temporal edges are distinct (we will see in the next section how the algorithm can be adapted to the case in which this assumption is not satisfied), and that the temporal graph is directed (if this is not the case, we simply have to examine each edge twice by inverting the source and the destination). The algorithm maintains, for each node x of G, a triple τ x = (l x , r x , s x ), which indicates that, for any time instant t in [l x , r x ), any latest starting t-path P from x to d has starting time σ(P) equal to s x (see Algorithm 2). At the beginning, we do not know anything about the reachability of d from a node x: hence, we set the starting time of x equal to ∞ for an arbitrary time interval (for example, [t ω + 1, t ω + 2)) following t ω (line 1). When we read a new temporal edge (x, y, t), we first set τ d = (t, t + 1, t + 1), since, clearly, the destination node can always reach itself even starting after the appearance of the edge (line 2). Let τ x = (l x , r x , s x ) and τ y = (l y , r y , s y ) be the two triples associated with x and y, respectively. If l x > l y , then we add to C(x, d) the contribution corresponding to the interval [l x , r x ) (line 3) and we update the triple associated with x by setting τ x = (l y , l x , t) (line 4). This update is justified by Lemma 2.
When all temporal edges have been read, for each node x, we add to C(x, d) the contribution corresponding to the interval [l x , r x ) and to the interval [t α , l x ) (line 5), which are the last intervals for which the latest starting time has been computed. The way of computing the contribution to C(x, d) (lines 3 and 5) is justified by the proof of Theorem 2.

Algorithm 2:
Algorithm for computing the closeness contribution of a node to all the others. Data: stream of temporal edges of a directed temporal graph G = (V, E), in non-increasing order with respect to their appearing times, and a destination d. Result: array of real numbers containing the "contribution" of d to the closeness of all the other nodes.
while there are other edges to be read do
    let e ← (x, y, t) be the next edge;

Lemma 2. Let G = (V, E) be a temporal graph and d ∈ V. For any u ∈ V with u ≠ d, let Ξ_u = τ_{u,1}, τ_{u,2}, …, τ_{u,h_u}, τ_{u,h_u+1} be the sequence of triples τ_{u,i} = (l_{u,i}, r_{u,i}, s_{u,i}) such that l_{u,h_u+1} = t_ω + 1, r_{u,h_u+1} = t_ω + 2, s_{u,h_u+1} = ∞, and, for 1 ≤ i ≤ h_u, (l_{u,i}, r_{u,i}, s_{u,i}) is the triple assigned to τ[u] at the (h_u + 1 − i)-th execution of line 4 with x = u during the running of Algorithm 2 with input G and d (note that h_u = 0 if this line is never executed with x = u). Then, for any u ∈ V with u ≠ d, the intervals [l_{u,i}, r_{u,i}), for 1 ≤ i ≤ h_u + 1, form a partition of the interval [l_{u,1}, t_ω + 2), and, for any t ∈ T(G) with t ∈ [l_{u,i}, r_{u,i}), the starting time of any latest starting t-path from u to d is equal to s_{u,i}.

Proof. We prove the lemma by induction on the number k of temporal edges that have been read. In particular, for any k with 0 ≤ k ≤ |E|, let S(k) be the following statement.
For any u ∈ V with u ≠ d, let Ξ^k_u = τ_{u,h^k_u}, …, τ_{u,h_u+1} be the suffix of Ξ_u containing the triples assigned to τ[u] at line 4 with x = u after having read k edges. The intervals [l_{u,i}, r_{u,i}), for h^k_u ≤ i ≤ h_u + 1, form a partition of the interval [l_{u,h^k_u}, t_ω + 2), and, for any t ∈ [l_{u,h^k_u}, t_ω] with t ∈ [l_{u,i}, r_{u,i}), the starting time of any latest starting t-path from u to d is equal to s_{u,i}.

We now prove by induction on k that S(k) is true for any k with 0 ≤ k ≤ |E|.

Base case. k = 0. In this case, no edge has been read yet and, hence, line 4 has never been executed with x = u. We then have that, for any u ∈ V with u ≠ d, h^0_u = h_u + 1, Ξ^0_u = τ_{u,h_u+1} with τ_{u,h_u+1} = (t_ω + 1, t_ω + 2, ∞) and, hence, [l_{u,h_u+1}, r_{u,h_u+1}) = [t_ω + 1, t_ω + 2) = [l_{u,h^0_u}, t_ω + 2). Moreover, the interval [l_{u,h^0_u}, t_ω] = [t_ω + 1, t_ω] is empty, so the condition on the latest starting times is vacuously true. Hence, S(0) is true.

Induction step.
Given k with 1 ≤ k ≤ |E|, suppose that S(k − 1) is true. We now prove that S(k) is also true. Let e = (x, y, t) be the k-th temporal edge read by the algorithm. Clearly, this edge has no influence on any node other than x (since the graph is directed). Hence, we just have to prove that the value of τ[x] is correctly updated. By the induction hypothesis, we know that the current value of τ[x] = (l_x, r_x, s_x) is such that, for any t′ ∈ [l_x, t_ω], the starting time of any latest starting t′-path from x to d is at least s_x > t. Hence, the edge e cannot improve these starting times, since its appearing time is t. Analogously, we know that the current value of τ[y] = (l_y, r_y, s_y) is such that s_y > t is the starting time of any latest starting t′-path from y to d with t′ ∈ [l_y, r_y). If l_y ≥ l_x, the edge e does not add any information for the node x, since we already know the starting time of any latest starting t′-path from x to d, for any t′ ≥ l_x. On the contrary (see the right part of Figure 3), if l_y < l_x, then, for any time instant t′ ∈ [l_y, l_x) (for which we did not yet know the corresponding latest starting time from x), we can now say that we can first reach y (at time t with t < s_y ≤ l_y, by using the temporal edge e), and then wait until starting the path from y to d at time s_y: hence, for all these time instants, the latest starting time at x can now be set equal to t, that is, the value of τ[x] becomes (l_y, l_x, t) (note that subsequent edges cannot improve this value since their appearing times are smaller than t).
The lemma follows from the fact that its statement is exactly equivalent to S(|E|).

Proof of Theorem 2. From Lemma 2, it follows that, with respect to the node u, the interval I = [l_{u,1}, t_ω + 2) is partitioned into h_u + 1 intervals [l_{u,i}, r_{u,i}), such that, for any t ∈ [t_α, t_ω], if s_{u,i} < t ≤ s_{u,i+1}, then d_t(u, d) = r_{u,i} − t + 1. That is, each time instant t ∈ (s_{u,i}, s_{u,i+1}] ∩ [t_α, t_ω] contributes to C(u, d) with the value 1/(r_{u,i} − t + 1). Hence, we have that

C(u, d) = ∫_{t_α}^{s_{u,1}} 1/(l_{u,1} − t + 1) dt + Σ_{i=1}^{h_u} ∫_{s_{u,i}}^{min(s_{u,i+1}, t_ω)} 1/(r_{u,i} − t + 1) dt.

Note that we used the minimum function in order to deal with the last interval, whose right extreme is greater than t_ω and whose left extreme can also be greater than t_ω (whenever u cannot reach d in the interval [t_α, t_ω]). Note also that the interval [t_α, s_{u,1}] might be non-empty: if this is the case, then we can conclude that, if we start from u at time t in this interval, then we cannot reach d before l_{u,1}. That is, d_t(u, d) = l_{u,1} − t + 1 for any t ∈ [t_α, s_{u,1}] and, hence, this interval contributes to C(u, d) with the first integral in the previous equation. Hence, Algorithm 2 correctly computes the contribution C(u, d) of node d to the closeness of node u.
The time complexity of the algorithm is clearly O(|E|), since each temporal edge is analyzed only once, and each update operation requires constant time. The space complexity is linear in the number of nodes, since, for each node u, we have to maintain (apart from the value C) just four numbers, corresponding to the current value of τ[u] and to the starting value of the previous triple.
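The core idea of the backward scan can be illustrated with a minimal sketch that computes only the latest departure time from each node that still allows it to reach d, instead of the full interval structure maintained by Algorithm 2. All names are ours, edges are treated as instantaneous, and this is a simplified illustration under those assumptions, not the algorithm itself.

```python
import math

def latest_departure_times(edges, d):
    """Single backward pass over directed temporal edges (x, y, t),
    given in non-increasing order of t, computing for each node the
    latest time at which it can leave and still reach d.

    Simplified sketch: edges are instantaneous, so edge (x, y, t) is
    usable from x at departure time t provided y can still continue
    towards d at some time >= t."""
    latest = {d: math.inf}  # d can "reach itself" at any time
    for x, y, t in edges:
        # continuing from y at time >= t must be possible
        if latest.get(y, -math.inf) >= t:
            # times are scanned in decreasing order, so the first
            # assignment is also the largest; max() keeps it safe
            latest[x] = max(latest.get(x, -math.inf), t)
    return latest

# toy instance: 0 -(t=2)-> 1 -(t=4)-> 2 -(t=6)-> 3
edges = [(2, 3, 6), (1, 2, 4), (0, 1, 2), (1, 3, 1)]
lat = latest_departure_times(edges, 3)
```

Algorithm 2 refines this single value into the list of triples (l, r, s), which additionally records the arrival window at d associated with each latest departure time, and hence all the t-distances.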
From the definition of closeness and of C(u, d), it follows that

C(u) = (1/(n − 1)) Σ_{v ∈ V} C(u, v).

This formula immediately suggests the following definition of an estimator of the closeness of a node.

Definition 6. Given a temporal graph G = (V, E), a node u, and a (multi)set X = {x_1, …, x_h} of nodes in V, we define the closeness X-estimator of u in G as

C_X(u) = (n/(h(n − 1))) Σ_{i=1}^{h} C(u, x_i).

Theorem 3. Let G = (V, E) be a temporal graph and X ⊆ V be a randomly chosen (multi)set of h nodes in G. If h = Θ(log n/ε²), then, for any node u ∈ V, |C(u) − C_X(u)| ≤ ε with high probability.
The proof of the above theorem uses the same techniques as [4], and is very similar to the one given in [40] to analyze the absolute error of a sampling-based algorithm for computing distance distribution approximations in static graphs. For the sake of completeness, we give the complete proof. As a first step, the following lemma shows that the closeness estimator is unbiased. Lemma 3. Given a temporal graph G = (V, E) and a uniformly randomly chosen node x ∈ V, the expected value of C_{x}(u) is equal to C(u).
Proof. Since x has been chosen uniformly at random, we have that

E[C_{x}(u)] = Σ_{v ∈ V} (1/n) (n/(n − 1)) C(u, v).

From the definition of the estimator and from the fact that h = 1, it follows that

E[C_{x}(u)] = (1/(n − 1)) Σ_{v ∈ V} C(u, v) = C(u).

The lemma is thus proved.
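The unbiasedness of the single-sample estimator can be checked numerically. The sketch below uses an arbitrary toy matrix of contributions C(u, d) (values ours, with C(u, u) = 0 by the convention above) and the estimator as stated: averaging it over all equally likely samples recovers the exact closeness.

```python
# toy matrix of contributions C(u, d); values are arbitrary,
# with C(u, u) = 0 by convention
n = 4
contrib = [[0.0, 0.2, 0.5, 0.1],
           [0.3, 0.0, 0.4, 0.2],
           [0.1, 0.6, 0.0, 0.3],
           [0.2, 0.1, 0.7, 0.0]]

def closeness(u):
    # C(u) = (1/(n-1)) * sum over v of C(u, v)
    return sum(contrib[u]) / (n - 1)

def single_sample_estimator(u, x):
    # C_{x}(u) = (n/(n-1)) * C(u, x)
    return n / (n - 1) * contrib[u][x]

# averaging the estimator over all equally likely samples x
# recovers the exact closeness: E[C_{x}(u)] = C(u)
u = 1
expected = sum(single_sample_estimator(u, x) for x in range(n)) / n
```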
In order to prove Theorem 3, we make use of the following application of Hoeffding's inequality (see, for example, [41]).

Theorem 4 (Hoeffding's inequality). Let X_1, …, X_h be independent random variables such that, for each i, a_i ≤ X_i ≤ b_i, and let X̄ = (1/h) Σ_{i=1}^{h} X_i. Then, for any ε > 0,

Pr[|X̄ − E[X̄]| ≥ ε] ≤ 2e^{−2h²ε²/Σ_{i=1}^{h}(b_i − a_i)²}.
Proof of Theorem 3. Given a temporal graph G = (V, E), a node u, and a randomly chosen (multi)set X of nodes in V with X = {x_1, …, x_h}, we apply the above theorem by setting, for each i with 1 ≤ i ≤ h,

X_i = (n/(n − 1)) C(u, x_i), so that X̄ = C_X(u).

Moreover, from the definition of the estimator and from the fact that we can assume that the number n of nodes is at least equal to 2, we have that, for each i with 1 ≤ i ≤ h,

0 ≤ X_i ≤ n/(n − 1) ≤ 2

(recall that C(u, x_i) ∈ [0, 1]). Finally, we also have that E[X̄] = C(u), by Lemma 3 and the linearity of expectation. Hence, from Theorem 4, it follows that, for any ε ≥ 0,

Pr[|C_X(u) − C(u)| ≥ ε] ≤ 2e^{−2h²ε²/(4h)} = 2e^{−hε²/2}.

By choosing h = 2 log n/ε², we then have that

Pr[|C_X(u) − C(u)| ≥ ε] ≤ 2e^{−log n} = 2/n,

and the theorem thus follows.
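Plugging numbers into the choice h = 2 log n/ε² (interpreting log as the natural logarithm, as is usual in Hoeffding-style bounds) shows how mildly the sample size grows with n; the function name is ours.

```python
import math

def sample_size(n, eps):
    # h = 2 ln(n) / eps^2, rounded up to an integer number of samples
    return math.ceil(2 * math.log(n) / eps ** 2)

# one million nodes, absolute error 0.1:
# a few thousand samples instead of one million exact visits
h = sample_size(10 ** 6, 0.1)
```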

Finding Top-K Nodes
Theorem 3 states that we can approximate the closeness centrality of all nodes of a temporal graph by using a sample of size h, which is logarithmic with respect to the number of nodes. In Section 5, we will show how this approximation method works particularly well for nodes with a high closeness. Based on this observation, a natural strategy for finding the top-k nodes consists of: (a) computing the approximated temporal closeness of all nodes, using a sample of size h; (b) ranking the nodes according to this estimation and selecting the top-K nodes, with K > k; and (c) computing the exact closeness of these K nodes, ranking them, and selecting the top-k nodes. As we will see in Section 5, in practice, choosing h = K = 1024 has worked in all the cases we have investigated for finding the top-100 nodes. This leads to a total cost proportional to 2048 · m, which is a quite small fraction (between 1/10 and 1/100) of the cost that would be needed to compute the exact closeness of all nodes.
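The three steps (a)-(c) can be sketched as follows. Here closeness_of and estimate_all are our own stand-ins, backed by a toy random contribution matrix, for one O(m) exact computation and for h backward runs of Algorithm 2, respectively; choosing K = n in the toy run makes step (c) exhaustive, so the sketch recovers the exact top-k by construction.

```python
import random

# toy contribution matrix C(u, d) standing in for a real temporal graph
random.seed(42)
n = 50
contrib = [[0.0 if u == d else random.random() for d in range(n)]
           for u in range(n)]

def closeness_of(u):
    # stand-in for one O(m) run of the exact algorithm
    return sum(contrib[u]) / (n - 1)

def estimate_all(h):
    # stand-in for h backward runs of Algorithm 2
    sample = random.sample(range(n), h)
    return [n / (h * (n - 1)) * sum(contrib[u][x] for x in sample)
            for u in range(n)]

def top_k(k, h, K):
    est = estimate_all(h)                                     # (a) approximate all
    candidates = sorted(range(n), key=lambda u: -est[u])[:K]  # (b) top-K by estimate
    exact = {u: closeness_of(u) for u in candidates}          # (c) exact on candidates
    return sorted(candidates, key=lambda u: -exact[u])[:k]

result = top_k(k=5, h=10, K=n)  # K = n makes step (c) exhaustive here
```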

How to Deal with Multiple Edges
In the previous sections, we have assumed that, for each time t ∈ T (G), there exists at most one edge whose appearing time is equal to t. Clearly, this assumption is not realistic since, in the vast majority of real-world temporal graphs, many edges can appear at the same time. In this section, we show how we can modify Algorithm 2, in order to deal with this more general case (the modification of Algorithm 1 is similar). For the sake of clarity of exposition, we will assume that, for each node u, the algorithm maintains a list I u of triples (instead of just one triple): it is not difficult to show that only the last two triples are really necessary at each iteration of the algorithm, thus assuring that the algorithm itself has linear space complexity.
Let us suppose that a new temporal edge e = (x, y, t) arrives, and that the last triple inserted in I_x (respectively, I_y) is (l_x, r_x, s_x) (respectively, (l_y, r_y, s_y)). This implies that, if we want to arrive at the destination d in the interval [l_x, r_x) (respectively, [l_y, r_y)), then we cannot start from x (respectively, y) later than s_x (respectively, s_y). Remember that, in Algorithm 2, the temporal edges are scanned in non-increasing order with respect to their appearing times: hence, we know that t ≤ s_x ≤ l_x < r_x (respectively, t ≤ s_y ≤ l_y < r_y). We now distinguish the following cases.

1.
t < s_x ∧ t < s_y. In this case, neither x nor y has yet used an edge at time t. Hence, we can update the set of intervals as we did in the case of edges with distinct appearing times. That is, if l_y < l_x, then we add to I_x the triple (l_y, l_x, t).

2.
t < s_x ∧ t = s_y. In this case, y has already "encountered" an edge at time t. Let (l′_y, r′_y, s′_y) be the triple just before (l_y, r_y, s_y) in I_y (note that l′_y = r_y and that t = s_y < s′_y). If l′_y < l_x, then we add to I_x the triple (l′_y, l_x, t): indeed, since t < s′_y, we now know that, to arrive at d in the interval [l′_y, l_x), we can start from x at time t (by using the edge e), wait until time s′_y, and then follow the journey from y to d.

3.
t = s_x ∧ t < s_y. In this case, x has already "encountered" an edge at time t. If l_y < l_x, then we extend to the left the triple of x until l_y, that is, the last triple of I_x becomes (l_y, r_x, s_x): indeed, since s_x = t < s_y, we now know that, even to arrive at d in the interval [l_y, l_x), we can start from x at time t (by using the edge e), wait until time s_y, and then follow the journey from y to d.

4.
t = s_x ∧ t = s_y. In this case, both x and y have already "encountered" an edge at time t. Let (l′_y, r′_y, s′_y) be the triple just before (l_y, r_y, s_y) in I_y (note that l′_y = r_y and that t = s_y < s′_y). Similarly to the previous case, if l′_y < l_x, then we extend to the left the triple of x until l′_y.
Note that the update of the contribution to C(x, d) has to be performed only in the first two cases (and at the end of the while loop, in order to deal with the leftmost intervals). Note also that the four cases above can be implemented in constant time: hence, the time complexity of the modified algorithm is still linear in the number of temporal edges.
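The four cases translate almost literally into code. The sketch below (all names ours) updates the triple list I_x of node x when the edge (x, y, t) is read, assuming the list representation I_u described above; it omits the bookkeeping of the contribution to C(x, d) for brevity.

```python
def process_edge(I_x, I_y, t):
    """Update the triple list I_x of node x after reading edge (x, y, t).

    Each list holds triples (l, r, s), in order of insertion, meaning:
    to arrive at d within [l, r), depart no later than s. Sketch of the
    four-case analysis for edges sharing the same appearing time t;
    the contribution to C(x, d) is not tracked here."""
    lx, rx, sx = I_x[-1]
    ly, ry, sy = I_y[-1]
    if t < sx and t < sy:        # case 1: neither endpoint used time t yet
        if ly < lx:
            I_x.append((ly, lx, t))
    elif t < sx and t == sy:     # case 2: y already used an edge at time t
        lyp, ryp, syp = I_y[-2]  # triple just before the last one in I_y
        if lyp < lx:
            I_x.append((lyp, lx, t))
    elif t == sx and t < sy:     # case 3: x already used an edge at time t
        if ly < lx:
            I_x[-1] = (ly, rx, sx)   # extend x's last triple to the left
    else:                        # case 4: both already used time t
        lyp, ryp, syp = I_y[-2]
        if lyp < lx:
            I_x[-1] = (lyp, rx, sx)

# case 1 example: x newly covers arrival window [3, 6) departing at t = 2
I_x, I_y = [(6, 8, 5)], [(3, 6, 4)]
process_edge(I_x, I_y, 2)
```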

Experimental Results
In our experiments, we used 45 medium/large real-world temporal graphs taken from different application domains, that is, the collaboration, communication, and transportation domains. For the sake of brevity, we describe our results by referring to a sample of our dataset (see Table 1): the entire dataset and the entire set of experimental results are shown in Appendix A (Table A1). Here, we are going to use the following temporal graphs. Table 1. A sample of our dataset. For each graph, we report the number of nodes, the number of temporal edges, and the running times (in seconds) of EXACT (the cell marked with * is an estimation) and APX-1024 (average among 50 experiments). The running times of APX-h, for any other value of h, can be estimated as h · t/1024, where t is the running time of APX-1024.

Undirected Graphs

• TOPO. The nodes are autonomous systems and the temporal edges are connections between autonomous systems. The appearing time of an edge is the time-point of the corresponding connection [42][43][44].
• ALL, COME, FANT. Every node corresponds to an actor and two actors are connected by their collaboration in a movie, where the appearing time of an edge is the year of the movie. We use the whole temporal collaboration graph and the ones induced by the comedy and the fantasy genres [45].
• MELB. Nodes are transport stops and temporal edges are connections traversed by a public vehicle: the edge appearing time is the arrival time (see [30,31]).

Directed Graphs

• FBWA. The nodes of the graph are Facebook users, and each directed temporal edge links the user writing a post to the user whose wall the post is written on [43,44,46].
• LINU. The communication graph of the Linux kernel mailing list. An edge (u, v, t) means that user u sent an email to user v at time t [44].
• TWIT. Tweets about the migrant crisis of 2015. A directed edge (u, v, t) means that user u retweeted a tweet of user v at time t [47,48].
For all graphs (apart from TWIT), we have computed the exact values of the closeness centrality of all nodes, in order to evaluate the quality of our approximation algorithm and of our ranking algorithm. In the case of TWIT, we have executed only the approximation algorithm, in order to deduce some properties of the graph. Our computing platform is a machine with an Intel(R) Xeon(R) CPU E5-2620 v3 at 2.40 GHz, 24 virtual cores, and 128 GB of RAM, running Ubuntu Linux version 4.4.0-22-generic. The code was written in Java, using Java 1.8, and it is available at https://github.com/piluc/TemporalCloseness. Hereafter, we will refer to the exact algorithm as EXACT, and to our approximation algorithm as APX-h, where h denotes the sample size in the definition of the closeness estimator. For each graph in our dataset, we ran APX-h setting h equal to 32, 64, 128, 256, 512, and 1024, repeating each experiment 50 times.

Running Times
In Table 1 we also report, for each temporal graph, the running times in seconds of EXACT and APX-1024. In the case of TWIT, the EXACT value is an estimation, since the (sequential) execution of the algorithm would have taken approximately three years. In the case of APX-1024 we report the average running time over 50 experiments. We remark that there is very little variability in the execution time of Algorithm 2: hence, a quite precise estimation of the running times of APX-h, for any other value of h, can be obtained by considering the running time t of APX-1024 reported in Table 1, and by computing the value h · t/1024 (in particular, by taking h equal to the number of nodes, this is the formula used to compute the estimate of the running time of EXACT with input TWIT). As can be seen, the improvement of the running time of APX-1024 with respect to EXACT ranges from one to several orders of magnitude. As expected, this improvement is particularly evident in the case of large graphs, where we are able to compute an approximation of the closeness in less than 16 min instead of more than 5 days for ALL and in less than 8 h, instead of 3 years for TWIT.

Accuracy
In this section, we analyze the accuracy of the estimation performed by APX-h for different h. To this aim, we consider the following measures.

•
Mean Absolute Error (MAE) in each experiment. Namely, for each experiment, we compute (1/n) ∑_{u ∈ V} |C_X(u) − C(u)|, where X is the sample of size h randomly chosen by APX-h. This is guaranteed to be bounded with high probability (see Theorem 3).

•
Relative Error (RE), which is defined, for a given node u and for a given sample X, as |C_X(u) − C(u)|/C(u). We show that, even though we do not have any theoretical guarantee on this error, it is very low when considering nodes which are at the top of the ranking, while it gets larger for peripheral nodes.
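Both measures are straightforward to compute once the exact and the estimated values are available; a minimal sketch (function names ours), with arbitrary toy values:

```python
def mean_absolute_error(exact, estimated):
    # MAE = (1/n) * sum over u of |C_X(u) - C(u)|
    n = len(exact)
    return sum(abs(a - e) for a, e in zip(estimated, exact)) / n

def relative_error(exact_u, estimated_u):
    # RE = |C_X(u) - C(u)| / C(u); undefined when C(u) = 0
    return abs(estimated_u - exact_u) / exact_u

# toy values: the absolute errors are comparable, but the relative
# error of the peripheral (low-closeness) node is much larger
exact = [0.50, 0.25, 0.10]
estimated = [0.48, 0.27, 0.05]
mae = mean_absolute_error(exact, estimated)
re_top = relative_error(0.50, 0.48)  # top node
re_low = relative_error(0.10, 0.05)  # peripheral node
```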
MAE as a function of the sample size. Figure 4 shows the behaviour of the MAE as a function of the sample size, through box-and-whisker plots: for each graph and for each h (X-axis), the Y-axis reports the median (together with the minimum, the maximum, and the first and third quartiles) among 50 experiments of the MAEs obtained by running APX-h. For the sake of brevity, we show here just the plots for the graphs COME, FBWA, and MELB (the behaviour is similar for the other graphs). Clearly, the scale is different, due to the different values of the closeness centrality of each graph. For the sake of completeness, we report the average closeness of COME, FBWA, and MELB, which is, respectively, 6.1 · 10^−4, 5.4 · 10^−9, and 2.5 · 10^−5. As expected, when increasing the sample size h, the MAE gets consistently lower. This applies not only to the median but also to the variability, as the window between the minimum and the maximum, as well as the one between the quartiles, shrinks. In the case of h = 1024, if we compare the median of the MAEs with the corresponding average values of closeness for the three graphs, we get an error of 8%, 4%, and 6%, respectively.

RE as a function of the ranking. We now show that the behaviour of the RE of APX-h for all the nodes of each graph depends on their ranking. In particular, given a temporal graph with n nodes, let r be the ranking computed by EXACT and let r(i), for any i with 1 ≤ i ≤ n, be the node v having position i in the ranking r (smaller i means higher closeness). For each i, we compute the mean and the maximum RE over 50 experiments of APX-h when estimating the closeness of the node v = r(i): in the following, we denote by µRE(i) and mRE(i) these two values. Figure 5 reports, for each ranking position i, the maximum µRE(i) and mRE(i) of APX-1024 among all the nodes with position up to i, for the graphs COME, FBWA, and MELB (from top to bottom).
More specifically, the black plots depict the behavior of max_{1≤j≤i} µRE(j), while the red dashed plots depict the behavior of max_{1≤j≤i} mRE(j). As can be seen, both µRE(i) and mRE(i) are very small for nodes having a high closeness value (thus, a low rank), while they are larger for nodes having a lower closeness value (thus, a high rank). This behavior is quite natural: nodes having lower closeness are less often "backward" reachable from the sample, so their closeness is often estimated as zero, or, whenever they are "backward" reached by the sample, their closeness is overestimated. This induces, in general, a higher variability in their estimation. On the other hand, nodes having higher values of closeness behave more stably with respect to the chosen sample, leading to a better estimation. The overall good results shown by this experiment suggest that APX-h is able to give a very good estimation for the top-k nodes, that is, the k nodes having the highest closeness for a given constant k (see also Table A2 in the Appendix A, which shows the difference between the average RE of the top-100 nodes and of the other nodes, with respect to different sizes of the sample). However, it could happen that the closeness of nodes with a high rank, because of their possibly higher RE, is overestimated by APX-1024: thus, these nodes could overtake, in the ranking produced by APX-1024, nodes with higher closeness (and, hence, a lower rank). We will show in the next section that this is not the case in any of the graphs we have considered: intuitively, this phenomenon can be justified by the fact that the closeness of these nodes with high rank and high RE is so small that even a significant overestimation does not allow the nodes themselves to climb to the top positions. Figure 5. Relative error of APX-1024 as a function of rank position for the graphs COME, FBWA, and MELB.
In particular, the horizontal axis corresponds to the position of a node in the exact ranking, while the black (respectively, red dashed) plot indicates the maximum average (respectively, maximum) RE (over 50 experiments) of all the nodes up to that position. The plot is in log-log scale. Note that there are groups of nodes with very similar relative errors: as a result of a preliminary analysis of this phenomenon, we noticed that this is due to the existence of several small cliques disconnected from the rest of the graph.

Ranking and Finding Top-K Nodes
In the following, we analyze the performance of APX-h for different values of h, when retrieving the ranking of the nodes according to their closeness. We first discuss the quality of the whole ranking found by APX-h. Motivated by our experimental findings, we then focus on the problem of computing the top-k central nodes for some fixed values of k.

Ranking Convergence
Here, we analyze the convergence of Kendall's τ for the ranking retrieved by APX-h for different values of h (intuitively, Kendall's τ measures the similarity between two rankings of the same universe). Let r be a reference ranking and let q be the ranking found by APX-h. We compare these whole rankings using the weighted variation of Kendall's τ proposed in [49], which gives more weight (with hyperbolic decay) to inversions involving top nodes with respect to bottom ones. For all graphs (apart from TWIT), we used the exact ranking computed by means of EXACT as the reference ranking r and we analyzed τ for increasing values of h: we report in Figure 6 the average τ obtained by 50 runs of APX-h. As can be seen, the τ values become close to 1 very quickly, being always higher than 0.89 for h = 1024 (as shown in Table A3 in the Appendix A, in the entire dataset, the τ value is always higher than 0.865 for h = 1024). In the case of TWIT, we used, as the reference ranking r, the ranking obtained by running APX-1024, and we analyzed the τ values of APX-h up to h = 512. Once again, the obtained τ is greater than 0.9 already for h = 128. This result, combined with the analysis of the relative error depicted in Figure 5, strongly suggests that the strategy described at the end of Section 3 to find the top-k nodes might turn out to be very efficient: the verification of this hypothesis is the goal of the next section.
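The following is a simplified O(n²) sketch of a hyperbolically weighted rank correlation in the spirit of [49] (it is not the exact definition of that paper, which also handles ties and admits an O(n log n) algorithm): each pair of positions (i, j) in the reference ranking is weighted by 1/(i + 1) + 1/(j + 1), so that inversions among top nodes cost more than inversions among bottom ones. All names are ours.

```python
def weighted_kendall_tau(ref_scores, other_scores):
    """Compare two score vectors indexed by node.

    Pairs ordered the same way by both score vectors count positively,
    inverted pairs negatively; each pair of reference positions is
    weighted hyperbolically (ties are not handled in this sketch)."""
    n = len(ref_scores)
    # nodes sorted by reference score, best first
    order = sorted(range(n), key=lambda u: -ref_scores[u])
    num, den = 0.0, 0.0
    for i in range(n):
        for j in range(i + 1, n):
            w = 1.0 / (i + 1) + 1.0 / (j + 1)  # hyperbolic pair weight
            u, v = order[i], order[j]
            s = 1.0 if other_scores[u] > other_scores[v] else -1.0
            num += w * s
            den += w
    return num / den

scores = [0.9, 0.5, 0.3, 0.1]
same = weighted_kendall_tau(scores, scores)                    # identical rankings
opposite = weighted_kendall_tau(scores, [0.1, 0.3, 0.5, 0.9])  # reversed rankings
```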

Computing Top-K
Given an integer k, we show how APX-h behaves when finding the top-k central nodes, observing where the top-k nodes in the ranking induced by EXACT appear in the ranking induced by APX-h. In particular, let r be the exact ranking, such that r(i) is the vertex in the i-th position (smaller i corresponds to higher centrality), and let q be the ranking obtained by APX-h, with q^{-1} denoting its inverse. Given k, we compute the maximum ranking position q^{-1}(v) over the first k nodes v in r, namely γ(k) = max_{1≤i≤k} q^{-1}(r(i)). Hence, if k = 1, we are computing the position of the real top central node in the approximated ranking, while, for larger values of k, we are considering the worst-case positioning among the real top-k nodes. Figure 7 reports these values in the case k = 20 and for different values of h, for the graphs COME, FBWA, and MELB. In particular, for each h, it reports the median of the γ(20) values found among 50 experiments (together with the minimum, the maximum, and the first and third quartiles). As can be seen, despite the fact that the variability is relatively high for small sample sizes (namely, for h = 32 and h = 64), already with h = 128 it significantly reduces. In particular, by setting h = 512, we have that the top-20 nodes are always within the first 512 positions of the ranking found by APX-512. This suggests that, in order to find the exact top-20, it is enough to run APX-512 and then compute the exact closeness of the top-512 nodes found (in O(m) time each). We have verified this hypothesis for all the graphs in our dataset (apart from TWIT) and for different values of k (see Table 2 for the graphs in our sample dataset and Table A4 in the Appendix A for the entire dataset).
As a matter of fact, in the case of large graphs (that is, with more than 20,000 nodes), we can compute the top-20 nodes by executing APX-h with h = 1024 and then computing the exact closeness of the top-h nodes found: the time complexity of this approach is then O(2048 · m) (which is between 10 and 100 times better than the exact approach, when applied to the large graphs in our dataset). In the case of smaller graphs, the experimental results, described in Table A4 of the Appendix A, show that a smaller value of h (that is, h = 256) is almost always sufficient, thus giving a similar speed-up. Even more impressive is the fact that, in the case of large graphs, the same value of h (that is, h = 1024) can actually be used for finding the top-100 nodes.
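The quantity γ(k) used above has a direct implementation (names ours): given the exact ranking r and the approximate ranking q, it is the worst position in q of any of the true top-k nodes.

```python
def gamma(k, exact_ranking, approx_ranking):
    """gamma(k) = max over the true top-k nodes of their (1-based)
    position in the approximate ranking, i.e., max q^{-1}(r(i))."""
    position = {v: i + 1 for i, v in enumerate(approx_ranking)}  # q^{-1}
    return max(position[v] for v in exact_ranking[:k])

exact_ranking = ['a', 'b', 'c', 'd', 'e']   # true order by closeness
approx_ranking = ['b', 'a', 'd', 'c', 'e']  # order found by APX-h
g = gamma(2, exact_ranking, approx_ranking)  # 'a' is 2nd, 'b' is 1st
```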

Conclusions
We proposed a sampling-based approximation algorithm for the temporal closeness centrality measure, and we experimentally showed that this algorithm can be extremely efficient in computing the top-k nodes in real-world temporal graphs. An interesting open question is to understand why, in the case of a few graphs, our method is not as efficient as in the case of all the others: some preliminary experimental results suggest that this might happen because all nodes have basically the same (very small) temporal closeness. In order to attack this problem, we believe that it would be interesting to study the performance of our algorithm on random temporal graphs. Notice, however, that there is not yet a consensus on how to generate random temporal graphs, or on which features the random selection should be based: see, for instance, [50] and the references therein. Moreover, an interesting future research line is to explore the extension and application of our approach (by still referring to [32]) to the case in which temporal edges have a traveling time. Finally, it would be worth exploring the possibility of applying to the temporal closeness the approach of [10,51] for static graphs, which basically consists of executing breadth-first searches starting from all the nodes of the graph and of "cutting" these visits as soon as it can be deduced (by using appropriate bounds on the static closeness value) that the source of the visit is not among the top-k.

Appendix A. Further Experiments
In the following tables we will show our experimental test-bed and the results we obtained for all the graphs. In Table A1, we report the full list of our graphs, with their number of nodes and edges. Moreover, consistently with respect to Table 1, we also report the running time of EXACT and the average running time of APX-1024 among 50 experiments. Recall that the running times of APX-h, for any other value of h, can be obtained as h · t/1024, where t is the running time of APX-1024.
In Table A2, we report the average RE (together with the coefficient of variation) achieved by APX-256, APX-512, APX-1024, for the top-100 nodes, according to the exact ranking, and for the remaining nodes. The results largely confirm what we have shown in Figure 5, namely that the RE for top-nodes is almost always very small if compared to the RE of all the other nodes. The graphs in the upper part are undirected.
In Table A3, consistently with respect to Figure 6, we show the average Kendall's τ for all the graphs, comparing the ranking found by APX-h for h = 32, 64, 128, 256, 512, 1024 with the exact ranking (except for the twitter graph, where we refer to the ranking computed by APX-1024).
Finally, in Table A4, similarly to Table 2, we report the maximum position of the top-k nodes (for the exact ranking) in the approximate ranking computed by APX-h (over 50 experiments), with k = 1, 5, 10, 20, 100 and h = 256, 512, 1024. As we have estimated, obtaining the EXACT ranking and closeness of twitter requires more than three years. For this reason, we were not able to provide these results for this graph, so that it has been excluded from Tables A2 and A4.