# Finding Top-k Nodes for Temporal Closeness in Large Temporal Graphs


## Abstract


## 1. Introduction

#### 1.1. Our Results

#### 1.2. Other Related Work

#### 1.3. Structure of the Paper

#### 1.4. Definitions and Notations

**Definition 1.**

**Definition 2.**

**Definition 3.**

#### An Example

## 2. Computing the Closeness

**Lemma 1.**

Algorithm 1: Algorithm for computing the closeness of a node

**Proof.**

We now prove by induction on k that $\mathcal{A}\left(k\right)$ is true for any k with $0\le k\le \left|E\right|$, where $\mathcal{A}\left(k\right)$ is the following statement. For any $u\in V$ with $u\ne s$, let ${\Xi}_{u}^{k}=\langle {\tau}_{u,0},{\tau}_{u,1},\dots ,{\tau}_{u,{h}_{u}^{k}}\rangle $ be the prefix of ${\Xi}_{u}$ containing the triples assigned to $\tau \left[u\right]$ at line 4 with $y=u$ after having read k edges. The intervals $({l}_{u,i},{r}_{u,i}]$, for $0\le i\le {h}_{u}^{k}$, form a partition of the interval $({t}_{\alpha}-2,{r}_{u,{h}_{u}^{k}}]$, and, for any $t\in [{t}_{\alpha},{r}_{u,{h}_{u}^{k}}]$, if $t\in ({l}_{u,i},{r}_{u,i}]$ then ${d}_{t}(s,u)={a}_{u,i}-t+1$.

**Base case**. $k=0$. In this case, no edge has been read yet and, hence, line 4 has never been executed with $y=u$. We then have that, for any $u\in V$ with $u\ne s$, ${h}_{u}^{0}=0$, ${\Xi}_{u}^{0}=\langle {\tau}_{u,0}\rangle $ with ${\tau}_{u,0}=({t}_{\alpha}-2,{t}_{\alpha}-1,\infty )$, and, hence, $({l}_{u,0},{r}_{u,0}]=({t}_{\alpha}-2,{t}_{\alpha}-1]=({t}_{\alpha}-2,{r}_{u,{h}_{u}^{0}}]$. Moreover, the interval $[{t}_{\alpha},{r}_{u,{h}_{u}^{0}}]=[{t}_{\alpha},{t}_{\alpha}-1]$ is empty and the condition on the t-distances is “vacuously” true. Hence, $\mathcal{A}\left(0\right)$ is true.

**Induction step**. Given k with $1\le k\le \left|E\right|$, suppose that $\mathcal{A}(k-1)$ is true. We now prove that $\mathcal{A}\left(k\right)$ is also true. Let $e=(x,y,t)$ be the k-th temporal edge read by the algorithm. Clearly, this edge has no influence on any other node than y (since the graph is directed). Hence, we have just to prove that the value of $\tau \left[y\right]$ is correctly updated. By the induction hypothesis, we know that the current value of $\tau \left[y\right]=({l}_{y},{r}_{y},{a}_{y})$ is such that, for any ${t}^{\prime}\in [{t}_{\alpha},{r}_{y}]$, the ending time of any earliest arrival ${t}^{\prime}$-path from s to y is at most ${a}_{y}<t$. Hence, the edge e cannot improve these ending times since its appearing time is t. Analogously, we know that the current value of $\tau \left[x\right]=({l}_{x},{r}_{x},{a}_{x})$ is such that ${a}_{x}<t$ is the ending time of any earliest arrival ${t}^{\prime}$-path from s to x with ${t}^{\prime}\in ({l}_{x},{r}_{x}]$. If ${r}_{x}\le {r}_{y}$, the edge e does not add any information for the node y, since we already know the ending time of any earliest arrival ${t}^{\prime}$-path from s to y, for any ${t}^{\prime}\le {r}_{y}$. On the contrary (see the left part of Figure 3), if ${r}_{x}>{r}_{y}$, then, for any time instant ${t}^{\prime}\in ({r}_{y},{r}_{x}]$ (for which we did not know yet the corresponding ending time of any earliest arrival ${t}^{\prime}$-path from s to y), we can now say that we can first reach x (at time ${a}_{x}$ with ${r}_{x}\le {a}_{x}<t$), and then wait until the temporal edge e appears to move to y at time t: hence, for all these time instants, the earliest arrival time at y can now be set equal to t, that is, the value of $\tau \left[y\right]$ becomes $({r}_{y},{r}_{x},t)$ (note that subsequent edges cannot improve this value since their appearing times are greater than t). 
Hence, if ${\Xi}_{y}^{k-1}=\langle {\tau}_{y,0},{\tau}_{y,1},\dots ,{\tau}_{y,{h}_{y}^{k-1}}\rangle $, we have that ${\Xi}_{y}^{k}=\langle {\tau}_{y,0},{\tau}_{y,1},\dots ,{\tau}_{y,{h}_{y}^{k}}\rangle $ with ${h}_{y}^{k}={h}_{y}^{k-1}+1$ and ${\tau}_{y,{h}_{y}^{k}}=({r}_{y},{r}_{x},t)$. By the induction hypothesis, the intervals $({l}_{y,i},{r}_{y,i}]$, for $0\le i\le {h}_{y}^{k-1}$, form a partition of the interval $({t}_{\alpha}-2,{r}_{y,{h}_{y}^{k-1}}]$: by adding the triple $({r}_{y},{r}_{x},t)$, we obtain a partition of the interval $({t}_{\alpha}-2,{r}_{y,{h}_{y}^{k}}]$ (since ${r}_{y}={r}_{y,{h}_{y}^{k-1}}$ and ${r}_{x}={r}_{y,{h}_{y}^{k}}$). From the previous argument, it also follows that, for any ${t}^{\prime}\in [{t}_{\alpha},{r}_{y,{h}_{y}^{k}}]$, if ${t}^{\prime}\in ({l}_{y,i},{r}_{y,i}]$ then ${d}_{{t}^{\prime}}(s,y)={a}_{y,i}-{t}^{\prime}+1$. We have thus proved that $\mathcal{A}\left(k\right)$ is satisfied.
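The update rule analysed in the induction step can be sketched in Python as follows. This is a minimal sketch under stated assumptions, not a transcription of Algorithm 1: the names `forward_triples` and `t_distance` are hypothetical, edges are assumed to have distinct appearing times, and the treatment of edges leaving the source (whose right endpoint is taken to be the edge time t, since one can wait at s until t) is an assumption made for illustration.

```python
import math

def forward_triples(edges, source, t_alpha):
    """Scan temporal edges (x, y, t) by increasing t and build, for every
    node u != source, the list of triples (l, r, a) used in the proof."""
    xi = {}

    def last(u):
        # Every node starts with the sentinel triple (t_alpha-2, t_alpha-1, inf).
        return xi.setdefault(u, [(t_alpha - 2, t_alpha - 1, math.inf)])[-1]

    for x, y, t in sorted(edges, key=lambda e: e[2]):
        if y == source:
            continue  # no triples are kept for the source itself
        # Assumption: from the source one can wait until time t, so r_x = t.
        r_x = t if x == source else last(x)[1]
        r_y = last(y)[1]
        if r_x > r_y:
            # Starting times in (r_y, r_x] now reach y at time t.
            xi[y].append((r_y, r_x, t))
    return xi

def t_distance(triples, t):
    """Return a - t + 1 for the triple whose interval (l, r] contains t."""
    for l, r, a in triples:
        if l < t <= r:
            return a - t + 1
    return math.inf  # t lies beyond the last interval: unreachable

edges = [("s", "a", 1), ("a", "b", 2), ("s", "a", 3), ("a", "b", 4)]
xi = forward_triples(edges, "s", t_alpha=1)
print(xi["b"])
print([t_distance(xi["b"], t) for t in (1, 2, 3, 4)])
```

Each edge triggers a constant-time update once the edges are sorted, which is why a single scan suffices to obtain all the intervals for one source.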

**Theorem 1.**

**Proof.**

## 3. Approximating the Closeness

**Definition 4.**

**Definition 5.**

Algorithm 2: Algorithm for computing the closeness contribution of a node to all the others

**Lemma 2.**

**Proof.**

We now prove by induction on k that $\mathcal{S}\left(k\right)$ is true for any k with $0\le k\le \left|E\right|$, where $\mathcal{S}\left(k\right)$ is the following statement. For any $u\in V$ with $u\ne d$, let ${\Xi}_{u}^{k}=\langle {\tau}_{u,{h}_{u}^{k}},\dots ,{\tau}_{u,{h}_{u}+1}\rangle $ be the suffix of ${\Xi}_{u}$ containing the triples assigned to $\tau \left[u\right]$ at line 4 with $x=u$ after having read k edges. The intervals $[{l}_{u,i},{r}_{u,i})$, for ${h}_{u}^{k}\le i\le {h}_{u}+1$, form a partition of the interval $[{l}_{u,{h}_{u}^{k}},{t}_{\omega}+2)$, and, for any $t\in [{l}_{u,{h}_{u}^{k}},{t}_{\omega}]$, if ${s}_{u,i}<t\le {s}_{u,i+1}$ then ${d}_{t}(u,d)={r}_{u,i}-t+1$.

**Base case**. $k=0$. In this case, no edge has been read yet and, hence, line 4 has never been executed with $x=u$. We then have that, for any $u\in V$ with $u\ne d$, ${h}_{u}^{0}={h}_{u}+1$, ${\Xi}_{u}^{0}=\langle {\tau}_{u,{h}_{u}+1}\rangle $ with ${\tau}_{u,{h}_{u}+1}=({t}_{\omega}+1,{t}_{\omega}+2,\infty )$, and, hence, $[{l}_{u,{h}_{u}+1},{r}_{u,{h}_{u}+1})=[{t}_{\omega}+1,{t}_{\omega}+2)=[{l}_{u,{h}_{u}^{0}},{t}_{\omega}+2)$. Moreover, the interval $[{l}_{u,{h}_{u}^{0}},{t}_{\omega}]=[{t}_{\omega}+1,{t}_{\omega}]$ is empty and the condition on the t-distances is “vacuously” true. Hence, $\mathcal{S}\left(0\right)$ is true.

**Induction step**. Given k with $1\le k\le \left|E\right|$, suppose that $\mathcal{S}(k-1)$ is true. We now prove that $\mathcal{S}\left(k\right)$ is also true. Let $e=(x,y,t)$ be the k-th temporal edge read by the algorithm. Clearly, this edge has no influence on any other node than x (since the graph is directed). Hence, we have just to prove that the value of $\tau \left[x\right]$ is correctly updated. By the induction hypothesis, we know that the current value of $\tau \left[x\right]=({l}_{x},{r}_{x},{s}_{x})$ is such that, for any ${t}^{\prime}\in [{l}_{x},{t}_{\omega}]$, the starting time of any latest starting ${t}^{\prime}$-path from x to d is at least ${s}_{x}>t$. Hence, the edge e cannot improve these starting times since its appearing time is t. Analogously, we know that the current value of $\tau \left[y\right]=({l}_{y},{r}_{y},{s}_{y})$ is such that ${s}_{y}>t$ is the starting time of any latest starting ${t}^{\prime}$-path from y to d with ${t}^{\prime}\in [{l}_{y},{r}_{y})$. If ${l}_{y}\ge {l}_{x}$, the edge e does not add any information for the node x, since we already know the starting time of any latest starting ${t}^{\prime}$-path from x to d, for any ${t}^{\prime}\ge {l}_{x}$. On the contrary (see the right part of Figure 3), if ${l}_{y}<{l}_{x}$, then, for any time instant ${t}^{\prime}\in [{l}_{y},{l}_{x})$ (for which we did not know yet the corresponding latest starting time from x), we can now say that we can first reach y (at time t with $t<{s}_{y}\le {l}_{y}$ by using the temporal edge e), and then wait until starting the path from y to d at time ${s}_{y}$: hence, for all these time instants, the latest starting time at x can now be set equal to t, that is, the value of $\tau \left[x\right]$ becomes $({l}_{y},{l}_{x},t)$ (note that subsequent edges cannot improve this value since their appearing times are smaller than t). 
Hence, if ${\Xi}_{x}^{k-1}=\langle {\tau}_{x,{h}_{x}^{k-1}},\dots ,{\tau}_{x,{h}_{x}+1}\rangle $, we have that ${\Xi}_{x}^{k}=\langle {\tau}_{x,{h}_{x}^{k}},{\tau}_{x,{h}_{x}^{k-1}},\dots ,{\tau}_{x,{h}_{x}+1}\rangle $ with ${h}_{x}^{k}={h}_{x}^{k-1}-1$ and ${\tau}_{x,{h}_{x}^{k}}=({l}_{y},{l}_{x},t)$. By the induction hypothesis, the intervals $[{l}_{x,i},{r}_{x,i})$, for ${h}_{x}^{k-1}\le i\le {h}_{x}+1$, form a partition of the interval $[{l}_{x,{h}_{x}^{k-1}},{t}_{\omega}+2)$: by adding the triple $({l}_{y},{l}_{x},t)$, we obtain a partition of the interval $[{l}_{x,{h}_{x}^{k}},{t}_{\omega}+2)$ (since ${l}_{x}={l}_{x,{h}_{x}^{k-1}}$ and ${l}_{y}={l}_{x,{h}_{x}^{k}}$). From the previous argument, it also follows that, for any ${t}^{\prime}\in [{l}_{x,{h}_{x}^{k}},{t}_{\omega}]$, if ${s}_{x,i}<{t}^{\prime}\le {s}_{x,i+1}$ then ${d}_{{t}^{\prime}}(x,d)={r}_{x,i}-{t}^{\prime}+1$. We have thus proved that $\mathcal{S}\left(k\right)$ is satisfied.
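The backward scan analysed in this proof can be sketched in the same way. Again this is a hedged illustration rather than Algorithm 2 itself: `backward_triples` and `t_distance_to_dest` are hypothetical names, edge times are assumed to be distinct, and taking the left endpoint of the destination to be the edge time t (an edge entering d at time t arrives at time t) is an assumption. The lookup is one reading of the distance condition in the statement: a starting time t uses the first triple whose starting time s is at least t, and arrives at that triple's left endpoint.

```python
import math

def backward_triples(edges, dest, t_omega):
    """Scan temporal edges (x, y, t) by decreasing t and build, for every
    node u != dest, the list of triples (l, r, s), leftmost interval first."""
    xi = {}

    def first(u):
        # Every node starts with the sentinel (t_omega+1, t_omega+2, inf).
        return xi.setdefault(u, [(t_omega + 1, t_omega + 2, math.inf)])[0]

    for x, y, t in sorted(edges, key=lambda e: e[2], reverse=True):
        if x == dest:
            continue  # no triples are kept for the destination itself
        # Assumption: an edge entering dest at time t arrives at time t.
        l_y = t if y == dest else first(y)[0]
        l_x = first(x)[0]
        if l_y < l_x:
            # Arrival times in [l_y, l_x) are now achievable starting at t.
            xi[x].insert(0, (l_y, l_x, t))
    return xi

def t_distance_to_dest(triples, t):
    """Distance from a node to dest when starting at time t."""
    for l, r, s in triples:
        if s >= t:  # first journey that can still be caught
            return math.inf if s == math.inf else l - t + 1
    return math.inf

edges = [("a", "b", 2), ("b", "d", 3), ("a", "b", 4), ("b", "d", 6)]
xi = backward_triples(edges, "d", t_omega=6)
print(xi["a"])
print([t_distance_to_dest(xi["a"], t) for t in (1, 2, 3, 4, 5)])
```

A single backward scan thus yields, for one destination d, the distances from every other node at every starting time, which is what the sampling-based approximation relies on.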

**Theorem 2.**

**Proof.**

**Definition 6.**

**Theorem 3.**

**Lemma 3.**

**Proof.**

**Theorem 4.**

**Proof of Theorem 3.**

#### Finding Top-K Nodes

## 4. How to Deal with Multiple Edges

- $t<{s}_{x}\wedge t<{s}_{y}$. In this case, neither x nor y has yet used an edge at time t. Hence, we can update the set of intervals as we did in the case of edges with distinct appearing times. That is, if ${l}_{y}<{l}_{x}$, then add to ${I}_{x}$ the triple $({l}_{y},{l}_{x},t)$.
- $t<{s}_{x}\wedge t={s}_{y}$. In this case, y has already “encountered” an edge at time t. Let $({l}_{y}^{\prime},{r}_{y}^{\prime},{s}_{y}^{\prime})$ be the triple just before $({l}_{y},{r}_{y},{s}_{y})$ in ${I}_{y}$ (note that ${l}_{y}^{\prime}={r}_{y}$ and that $t={s}_{y}<{s}_{y}^{\prime}$). If ${l}_{y}^{\prime}<{l}_{x}$, then we add to ${I}_{x}$ the triple $({l}_{y}^{\prime},{l}_{x},t)$: indeed, since $t<{s}_{y}^{\prime}$, we now know that, to arrive at d in the interval $[{l}_{y}^{\prime},{l}_{x})$, we can start from x at time t (by using the edge e), wait until time ${s}_{y}^{\prime}$, and then follow the journey from y to d.
- $t={s}_{x}\wedge t<{s}_{y}$. In this case, x has already “encountered” an edge at time t. If ${l}_{y}<{l}_{x}$, then we extend to the left the triple of x until ${l}_{y}$: indeed, since ${s}_{x}<{s}_{y}$, we now know that, even to arrive at d in the interval $[{l}_{y},{l}_{x})$, we can start at time t (by using the edge e), wait until time ${s}_{y}$, and then follow the journey from y to d.
- $t={s}_{x}\wedge t={s}_{y}$. In this case, both x and y have already “encountered” an edge at time t. Let $({l}_{y}^{\prime},{r}_{y}^{\prime},{s}_{y}^{\prime})$ be the triple just before $({l}_{y},{r}_{y},{s}_{y})$ in ${I}_{y}$ (note that ${l}_{y}^{\prime}={r}_{y}$ and $t={s}_{y}<{s}_{y}^{\prime}$). Similarly to the previous case, if ${l}_{y}^{\prime}<{l}_{x}$, then we extend to the left the triple of x until ${l}_{y}^{\prime}$.
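The four cases above can be condensed into one update routine. The following Python sketch is only an illustration of the case analysis: the name `update_with_ties` is hypothetical, each of ${I}_{x}$ and ${I}_{y}$ is assumed to be a list of triples $(l,r,s)$ with the most recently added (leftmost) triple first, and edges are assumed to be scanned by non-increasing appearing time, so that $t\le {s}_{x}$ and $t\le {s}_{y}$ always hold. The special handling of the destination node is not covered here.

```python
import math

def update_with_ties(I_x, I_y, t):
    """Update I_x in place after reading the temporal edge (x, y, t)."""
    l_x, r_x, s_x = I_x[0]
    l_y, r_y, s_y = I_y[0]
    if t == s_y:
        # y already used an edge at time t: fall back to the previous
        # triple of y, whose starting time is strictly greater than t.
        l_y = I_y[1][0]
    if l_y >= l_x:
        return  # the edge adds no new arrival times for x
    if t < s_x:
        # x has not used an edge at time t yet: add a new triple.
        I_x.insert(0, (l_y, l_x, t))
    else:
        # t == s_x: extend the current triple of x to the left.
        I_x[0] = (l_y, r_x, s_x)

sentinel = (7, 8, math.inf)
I_x = [(6, 7, 4), sentinel]
I_y = [(3, 6, 5), (6, 7, 6), sentinel]
update_with_ties(I_x, I_y, 4)  # third case: t = s_x and t < s_y
print(I_x)
```

Folding the choice between ${l}_{y}$ and ${l}_{y}^{\prime}$ into a single variable keeps the two decisions (which left endpoint of y to use, and whether to add or extend a triple of x) independent, mirroring the structure of the case list.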

## 5. Experimental Results

- all, come, fant. Every node corresponds to an actor and two actors are connected by their collaboration in a movie, where the appearing time of an edge is the year of the movie. We use the whole temporal collaboration graph and the ones induced by the comedy and the fantasy genres [45].
- linu. The communication graph of the Linux kernel mailing list. An edge $(u,v,t)$ means that user u sent an email to user v at time t [44].

#### 5.1. Running Times

#### 5.2. Accuracy

- Mean Absolute Error (MAE) in each experiment. Namely, for each experiment, we compute ${\sum}_{u}|{C}^{X}\left(u\right)-C\left(u\right)|/n$, where X is the sample of size h randomly chosen by apx-h. This is guaranteed to be bounded with high probability (see Theorem 3).
- Relative Error (RE), which is defined, for a given node u and for a given sample X, as $|{C}^{X}\left(u\right)-C\left(u\right)|/C\left(u\right)$. We show that, even though we do not have any theoretical guarantee on this error, it is very low when considering nodes which are in the top of the ranking, while it gets bigger for peripheral nodes.
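As a small illustration of the two measures, the following Python sketch computes them from dictionaries mapping nodes to closeness values. The function names and the dictionary representation are assumptions made here for illustration, and the RE is only defined for nodes with $C\left(u\right)>0$.

```python
def mean_absolute_error(exact, approx):
    """MAE: sum over all nodes u of |C^X(u) - C(u)|, divided by n."""
    return sum(abs(approx[u] - exact[u]) for u in exact) / len(exact)

def relative_error(exact, approx, u):
    """RE of node u: |C^X(u) - C(u)| / C(u); requires C(u) > 0."""
    return abs(approx[u] - exact[u]) / exact[u]

# Hypothetical closeness values, chosen only to exercise the formulas.
exact = {"a": 0.5, "b": 0.25}
approx = {"a": 0.75, "b": 0.25}
print(mean_absolute_error(exact, approx))
print(relative_error(exact, approx, "a"))
```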

**MAE as a function of the sample size.** Figure 4 shows the behaviour of the MAE as a function of the sample size through box-and-whisker plots: for each graph and for each h (X-axis), the Y-axis reports the median (together with the minimum, the maximum, and the first and third quartiles) of the MAEs obtained over 50 experiments of apx-h. For the sake of brevity, we show here just the plots for the graphs come, fbwa, and melb (the behaviour is similar for the other graphs). Clearly, the scale is different due to the different values of the closeness centrality of each graph. For the sake of completeness, we report the average closeness of come, fbwa, and melb, which is, respectively, $6.1\cdot {10}^{-4}$, $5.4\cdot {10}^{-9}$, and $2.5\cdot {10}^{-5}$. As expected, when increasing the sample size h, the MAE gets consistently lower. In particular, this applies to the median but also to the variability: both the window between the minimum and the maximum and the one between the quartiles shrink. In the case of $h=1024$, if we compare the median of the MAEs with the corresponding average values of closeness for the three graphs, we get an error of 8%, 4%, and 6%, respectively.

**RE as a function of the ranking.** We now show that the behaviour of the RE of apx-h for all the nodes of each graph depends on their ranking. In particular, given a temporal graph with n nodes, let r be the ranking computed by exact and let $r\left(i\right)$, for any i with $1\le i\le n$, be the node v having position i in the ranking r (smaller i means higher closeness). For each i, we compute the mean and the maximum RE over 50 experiments of apx-h when estimating the closeness of the node $v=r\left(i\right)$: in the following, we denote by $\mu \mathrm{RE}\left(i\right)$ and $\mathrm{mRE}\left(i\right)$ these two values. Figure 5 reports, for each ranking position i, the maximum $\mu \mathrm{RE}\left(i\right)$ and $\mathrm{mRE}\left(i\right)$ of apx-1024 among all the nodes with position up to i, for the graphs come, fbwa, and melb (from top to bottom). More specifically, the black plots depict the behavior of ${max}_{1\le j\le i}\mu \mathrm{RE}\left(j\right)$, while the red dashed plots depict the behavior of ${max}_{1\le j\le i}\mathrm{mRE}\left(j\right)$. As can be seen, both $\mu \mathrm{RE}\left(i\right)$ and $\mathrm{mRE}\left(i\right)$ are very small for nodes having a high closeness value (thus low ranks), while they are larger for nodes having a lower closeness value (thus high ranks). This behavior is quite natural, as nodes having lower closeness are less often “backward” reachable from the sample and their closeness is often estimated as zero; whenever they are “backward” reached by the sample, their closeness is instead overestimated. This induces a higher variability in general for their estimation. On the other hand, nodes having higher values of closeness behave more stably with respect to the chosen sample, leading to a better estimation.
The overall good results shown by this experiment suggest that apx-h is able to give a very good estimation for the top-k nodes, i.e., the k nodes having the highest closeness for a given constant k (see also Table A2 in Appendix A, which shows the difference between the average RE of the top-100 nodes and of the other nodes, with respect to different sizes of the sample). However, it could happen that the closeness of nodes with high rank, because of their possibly higher value of RE, is overestimated by apx-1024: thus, these nodes could overtake, in the ranking produced by apx-1024, nodes with higher closeness (and, hence, lower rank). We will show in the next section that this is not the case in all the graphs we have considered: intuitively, this phenomenon can be justified by the fact that the closeness of these nodes with high rank and high RE is so small that even a significant overestimation of it does not allow the nodes themselves to climb to the top positions.
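The quantities plotted in Figure 5 are prefix maxima over the exact ranking. A minimal Python sketch of this computation is given below; the function name `re_profiles`, the dictionary-based input format, and the toy values are assumptions made for illustration, and every exact closeness value is assumed to be positive.

```python
from itertools import accumulate

def re_profiles(exact, runs):
    """Return the exact ranking together with the prefix maxima of the mean
    RE and of the maximum RE, both taken over the given experiments."""
    ranking = sorted(exact, key=exact.get, reverse=True)
    mu_re, m_re = [], []
    for v in ranking:
        errs = [abs(run[v] - exact[v]) / exact[v] for run in runs]
        mu_re.append(sum(errs) / len(errs))  # mean RE of the node at this rank
        m_re.append(max(errs))               # maximum RE of the node at this rank
    # accumulate(..., max) yields the maximum over positions 1..i for each i.
    return ranking, list(accumulate(mu_re, max)), list(accumulate(m_re, max))

# Hypothetical closeness values for three nodes and two experiments.
exact = {"a": 1.0, "b": 0.5, "c": 0.25}
runs = [{"a": 1.0, "b": 0.75, "c": 0.25}, {"a": 1.0, "b": 0.25, "c": 0.5}]
print(re_profiles(exact, runs))
```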

#### 5.3. Ranking and Finding Top-K Nodes

#### 5.3.1. Ranking Convergence

#### 5.3.2. Computing Top-K

## 6. Conclusions

## Author Contributions

## Funding

## Acknowledgments

## Conflicts of Interest

## Appendix A. Further Experiments

**Table A1.** Our dataset. For each graph we report the number of nodes, the number of temporal edges, and the running times (in seconds) of exact (the cell marked with * is an estimation) and apx-1024 (average among 50 experiments). The running times of apx-h, for any other value of h, can be estimated as $h\cdot t/1024$, where t is the running time of apx-1024.

| Name | Nodes | Edges | exact (s) | apx-1024 (s) |
|---|---|---|---|---|
| **Undirected Graphs** | | | | |
| topology | 34,761 | 154,842 | 1649 | 47 |
| adult | 12,621 | 109,455 | 878 | 40 |
| adventure | 47,763 | 157,492 | 4668 | 50 |
| all | 527,535 | 3,152,994 | 484,906 | 941 |
| animation | 10,817 | 31,499 | 213 | 20 |
| biography | 18,215 | 37,257 | 473 | 24 |
| comedy | 162,303 | 666,568 | 29,601 | 203 |
| family | 34,464 | 87,331 | 1815 | 33 |
| fantasy | 30,801 | 75,492 | 1433 | 30 |
| history | 20,016 | 46,028 | 623 | 25 |
| music | 16,417 | 36,217 | 346 | 21 |
| musical | 21,102 | 66,853 | 971 | 33 |
| mystery | 34,787 | 87,086 | 1863 | 33 |
| scifi | 24,551 | 54,578 | 916 | 30 |
| war | 19,690 | 51,980 | 617 | 27 |
| western | 11,344 | 58,230 | 382 | 29 |
| **Directed Graphs** | | | | |
| election | 7119 | 103,675 | 201 | 34 |
| | 46,952 | 876,993 | 12,184 | 264 |
| twitter * | 3,511,241 | 16,438,790 | 97,553,304 | 28,449 |
| linux | 6340 | 1,096,440 | 19,313 | 317 |
| adelaide | 7548 | 404,300 | 889 | 143 |
| belfast | 1917 | 122,693 | 62 | 33 |
| berlin | 4601 | 1,048,218 | 1358 | 352 |
| bordeaux | 3435 | 236,595 | 231 | 68 |
| brisbane | 9645 | 392,805 | 1051 | 110 |
| canberra | 2764 | 124,305 | 95 | 35 |
| detroit | 5683 | 214,863 | 350 | 63 |
| dublin | 4571 | 407,240 | 527 | 117 |
| grenoble | 1547 | 114,492 | 46 | 30 |
| helsinki | 6986 | 686,457 | 1342 | 196 |
| kuopio | 549 | 32,122 | 5 | 8 |
| lisbon | 7073 | 526,179 | 1019 | 167 |
| luxembourg | 1367 | 186,752 | 70 | 52 |
| melbourne | 19,493 | 1,098,227 | 6258 | 380 |
| nantes | 2353 | 196,421 | 126 | 55 |
| palermo | 2176 | 226,215 | 142 | 66 |
| paris | 11,950 | 1,823,872 | 6149 | 550 |
| prague | 5147 | 670,423 | 947 | 190 |
| rennes | 1407 | 109,075 | 42 | 30 |
| rome | 7869 | 1,051,211 | 2451 | 364 |
| sydney | 24,063 | 1,265,135 | 8635 | 411 |
| toulouse | 3329 | 224,516 | 204 | 63 |
| turku | 1850 | 133,512 | 69 | 38 |
| venice | 1874 | 118,519 | 59 | 32 |
| winnipeg | 5079 | 333,882 | 492 | 99 |

**Table A2.** Average RE (and coefficient of variation) for the top-100 nodes (according to the exact ranking) and for the remaining ones.

| Name | $\mu \mathrm{RE}$ of apx-256, Top-100 | $\mu \mathrm{RE}$ of apx-256, Others | $\mu \mathrm{RE}$ of apx-512, Top-100 | $\mu \mathrm{RE}$ of apx-512, Others | $\mu \mathrm{RE}$ of apx-1024, Top-100 | $\mu \mathrm{RE}$ of apx-1024, Others |
|---|---|---|---|---|---|---|
| topology | 0.145 (0.14) | 0.351 (2.03) | 0.099 (0.09) | 0.323 (2.03) | 0.061 (0.12) | 0.287 (2.15) |
| adult | 0.078 (0.11) | 0.671 (1.18) | 0.050 (0.09) | 0.567 (1.23) | 0.034 (0.06) | 0.447 (1.28) |
| adventure | 0.087 (0.07) | 1.324 (0.83) | 0.060 (0.06) | 1.259 (0.76) | 0.049 (0.06) | 1.158 (0.75) |
| all | 0.055 (0.05) | 1.089 (2.12) | 0.035 (0.04) | 1.044 (1.69) | 0.024 (0.07) | 0.991 (1.41) |
| animation | 0.140 (0.05) | 1.429 (0.50) | 0.091 (0.10) | 1.250 (0.51) | 0.057 (0.10) | 1.012 (0.55) |
| biography | 0.208 (0.18) | 1.705 (0.42) | 0.171 (0.16) | 1.568 (0.38) | 0.117 (0.15) | 1.378 (0.39) |
| comedy | 0.063 (0.05) | 1.237 (1.21) | 0.038 (0.08) | 1.172 (1.03) | 0.038 (0.10) | 1.111 (0.95) |
| family | 0.169 (0.06) | 1.723 (0.50) | 0.156 (0.05) | 1.612 (0.43) | 0.100 (0.11) | 1.468 (0.42) |
| fantasy | 0.274 (0.09) | 1.701 (0.48) | 0.207 (0.16) | 1.588 (0.43) | 0.129 (0.11) | 1.433 (0.42) |
| history | 0.207 (0.11) | 1.644 (0.45) | 0.142 (0.12) | 1.504 (0.42) | 0.103 (0.09) | 1.316 (0.43) |
| music | 0.189 (0.09) | 1.714 (0.39) | 0.159 (0.07) | 1.577 (0.35) | 0.104 (0.10) | 1.378 (0.36) |
| musical | 0.158 (0.20) | 1.383 (0.61) | 0.100 (0.28) | 1.246 (0.61) | 0.069 (0.25) | 1.082 (0.64) |
| mystery | 0.134 (0.11) | 1.582 (0.60) | 0.090 (0.09) | 1.485 (0.54) | 0.066 (0.11) | 1.355 (0.52) |
| scifi | 0.204 (0.11) | 1.754 (0.41) | 0.139 (0.11) | 1.639 (0.36) | 0.100 (0.11) | 1.464 (0.35) |
| war | 0.150 (0.10) | 1.446 (0.55) | 0.112 (0.09) | 1.300 (0.55) | 0.064 (0.12) | 1.118 (0.58) |
| western | 0.070 (0.17) | 0.836 (1.01) | 0.050 (0.14) | 0.730 (1.04) | 0.034 (0.19) | 0.593 (1.08) |
| election | 0.124 (0.16) | 0.581 (1.41) | 0.086 (0.17) | 0.503 (1.46) | 0.061 (0.16) | 0.438 (1.53) |
| | 0.075 (0.38) | 0.614 (1.81) | 0.060 (0.25) | 0.557 (1.64) | 0.049 (0.39) | 0.506 (1.61) |
| linux | 0.352 (0.17) | 0.445 (1.95) | 0.208 (0.19) | 0.387 (1.96) | 0.151 (0.23) | 0.344 (2.04) |
| adelaide | 0.055 (0.33) | 0.094 (2.88) | 0.042 (0.27) | 0.073 (3.03) | 0.033 (0.28) | 0.057 (3.19) |
| belfast | 0.146 (0.34) | 0.186 (1.02) | 0.112 (0.34) | 0.148 (0.98) | 0.084 (0.31) | 0.108 (1.02) |
| berlin | 0.069 (0.23) | 0.041 (1.91) | 0.050 (0.19) | 0.030 (2.13) | 0.036 (0.20) | 0.023 (2.57) |
| bordeaux | 0.059 (0.25) | 0.112 (2.29) | 0.048 (0.20) | 0.090 (2.34) | 0.035 (0.20) | 0.068 (2.42) |
| brisbane | 0.092 (0.33) | 0.176 (2.02) | 0.078 (0.30) | 0.146 (2.07) | 0.059 (0.34) | 0.118 (2.19) |
| canberra | 0.118 (0.30) | 0.148 (1.14) | 0.095 (0.32) | 0.121 (1.18) | 0.074 (0.31) | 0.093 (1.20) |
| detroit | 0.051 (0.19) | 0.074 (2.30) | 0.037 (0.16) | 0.059 (2.62) | 0.028 (0.20) | 0.041 (2.69) |
| dublin | 0.066 (0.27) | 0.186 (2.23) | 0.053 (0.27) | 0.152 (2.30) | 0.041 (0.27) | 0.118 (2.37) |
| grenoble | 0.144 (0.21) | 0.340 (1.19) | 0.106 (0.22) | 0.261 (1.23) | 0.076 (0.24) | 0.182 (1.24) |
| helsinki | 0.082 (0.25) | 0.122 (2.65) | 0.069 (0.24) | 0.102 (2.76) | 0.051 (0.23) | 0.081 (2.93) |
| kuopio | 0.117 (0.32) | 0.205 (1.18) | 0.085 (0.31) | 0.140 (1.15) | 0.058 (0.31) | 0.100 (1.16) |
| lisbon | 0.124 (0.20) | 0.198 (1.65) | 0.098 (0.18) | 0.156 (1.85) | 0.071 (0.15) | 0.113 (2.04) |
| luxembourg | 0.073 (0.36) | 0.050 (1.53) | 0.055 (0.34) | 0.037 (1.94) | 0.039 (0.30) | 0.026 (1.88) |
| melbourne | 0.062 (0.24) | 0.142 (2.26) | 0.051 (0.22) | 0.117 (2.40) | 0.035 (0.25) | 0.095 (2.46) |
| nantes | 0.117 (0.25) | 0.188 (1.57) | 0.094 (0.24) | 0.150 (1.68) | 0.067 (0.25) | 0.110 (1.70) |
| palermo | 0.066 (0.26) | 0.041 (0.35) | 0.051 (0.27) | 0.030 (0.36) | 0.036 (0.28) | 0.021 (0.36) |
| paris | 0.074 (0.25) | 0.290 (1.97) | 0.055 (0.24) | 0.249 (1.99) | 0.043 (0.21) | 0.209 (2.01) |
| prague | 0.097 (0.20) | 0.242 (2.03) | 0.081 (0.17) | 0.210 (2.09) | 0.056 (0.19) | 0.164 (2.20) |
| rennes | 0.094 (0.19) | 0.130 (1.76) | 0.068 (0.23) | 0.095 (1.84) | 0.048 (0.23) | 0.066 (1.87) |
| rome | 0.053 (0.24) | 0.098 (3.07) | 0.040 (0.20) | 0.079 (3.21) | 0.032 (0.21) | 0.062 (3.37) |
| sydney | 0.122 (0.28) | 0.276 (1.85) | 0.105 (0.25) | 0.238 (1.94) | 0.081 (0.23) | 0.204 (1.99) |
| toulouse | 0.105 (0.28) | 0.157 (1.41) | 0.081 (0.30) | 0.124 (1.47) | 0.064 (0.26) | 0.096 (1.51) |
| turku | 0.067 (0.31) | 0.113 (2.45) | 0.046 (0.33) | 0.085 (2.66) | 0.035 (0.36) | 0.062 (2.67) |
| venice | 0.131 (0.27) | 0.203 (1.53) | 0.098 (0.30) | 0.156 (1.54) | 0.072 (0.30) | 0.114 (1.62) |
| winnipeg | 0.054 (0.36) | 0.039 (2.42) | 0.040 (0.31) | 0.030 (3.29) | 0.032 (0.32) | 0.022 (2.98) |

**Table A3.** Average Kendall’s $\tau $ values for the graphs in our dataset: the Kendall’s $\tau $ is computed by referring to the ranking computed by exact (the cell marked with * is an estimation), apart from the twitter graph, where we refer to the ranking computed by apx-1024.

| Name | apx-32 | apx-64 | apx-128 | apx-256 | apx-512 | apx-1024 |
|---|---|---|---|---|---|---|
| topology | 0.976 | 0.982 | 0.986 | 0.989 | 0.991 | 0.992 |
| adult | 0.920 | 0.936 | 0.958 | 0.968 | 0.974 | 0.979 |
| adventure | 0.920 | 0.938 | 0.948 | 0.955 | 0.960 | 0.963 |
| all | 0.954 | 0.964 | 0.972 | 0.977 | 0.981 | 0.983 |
| animation | 0.854 | 0.890 | 0.906 | 0.914 | 0.920 | 0.925 |
| biography | 0.624 | 0.747 | 0.829 | 0.858 | 0.869 | 0.876 |
| comedy | 0.941 | 0.954 | 0.961 | 0.968 | 0.973 | 0.975 |
| family | 0.717 | 0.811 | 0.864 | 0.879 | 0.890 | 0.898 |
| fantasy | 0.631 | 0.757 | 0.820 | 0.864 | 0.883 | 0.894 |
| history | 0.646 | 0.761 | 0.823 | 0.867 | 0.880 | 0.891 |
| music | 0.613 | 0.760 | 0.829 | 0.851 | 0.861 | 0.866 |
| musical | 0.813 | 0.884 | 0.915 | 0.926 | 0.938 | 0.944 |
| mystery | 0.808 | 0.860 | 0.901 | 0.917 | 0.925 | 0.929 |
| scifi | 0.628 | 0.721 | 0.803 | 0.842 | 0.855 | 0.865 |
| war | 0.782 | 0.855 | 0.894 | 0.913 | 0.925 | 0.933 |
| western | 0.927 | 0.947 | 0.957 | 0.965 | 0.972 | 0.976 |
| election | 0.903 | 0.930 | 0.949 | 0.962 | 0.971 | 0.978 |
| | 0.942 | 0.957 | 0.967 | 0.973 | 0.981 | 0.984 |
| linux | 0.959 | 0.970 | 0.978 | 0.982 | 0.985 | 0.988 |
| adelaide | 0.847 | 0.894 | 0.932 | 0.957 | 0.970 | 0.979 |
| belfast | 0.721 | 0.768 | 0.795 | 0.863 | 0.906 | 0.935 |
| berlin | 0.869 | 0.904 | 0.940 | 0.958 | 0.970 | 0.979 |
| bordeaux | 0.725 | 0.797 | 0.874 | 0.918 | 0.944 | 0.962 |
| brisbane | 0.872 | 0.908 | 0.952 | 0.971 | 0.980 | 0.985 |
| canberra | 0.811 | 0.860 | 0.911 | 0.935 | 0.951 | 0.966 |
| detroit | 0.804 | 0.860 | 0.905 | 0.941 | 0.962 | 0.973 |
| dublin | 0.805 | 0.865 | 0.902 | 0.935 | 0.957 | 0.970 |
| grenoble | 0.698 | 0.731 | 0.778 | 0.845 | 0.907 | 0.941 |
| helsinki | 0.843 | 0.873 | 0.901 | 0.926 | 0.951 | 0.967 |
| kuopio | 0.759 | 0.806 | 0.859 | 0.898 | 0.927 | 0.949 |
| lisbon | 0.845 | 0.872 | 0.909 | 0.938 | 0.954 | 0.968 |
| luxembourg | 0.885 | 0.915 | 0.939 | 0.955 | 0.969 | 0.978 |
| melbourne | 0.696 | 0.711 | 0.773 | 0.841 | 0.894 | 0.960 |
| nantes | 0.710 | 0.751 | 0.813 | 0.880 | 0.927 | 0.949 |
| palermo | 0.791 | 0.850 | 0.898 | 0.928 | 0.950 | 0.964 |
| paris | 0.696 | 0.736 | 0.794 | 0.891 | 0.944 | 0.966 |
| prague | 0.833 | 0.852 | 0.900 | 0.926 | 0.947 | 0.962 |
| rennes | 0.754 | 0.759 | 0.842 | 0.899 | 0.934 | 0.955 |
| rome | 0.825 | 0.862 | 0.917 | 0.949 | 0.966 | 0.976 |
| sydney | 0.831 | 0.870 | 0.898 | 0.941 | 0.965 | 0.977 |
| toulouse | 0.758 | 0.770 | 0.825 | 0.890 | 0.943 | 0.959 |
| turku | 0.842 | 0.881 | 0.923 | 0.946 | 0.961 | 0.973 |
| venice | 0.812 | 0.868 | 0.897 | 0.934 | 0.951 | 0.965 |
| winnipeg | 0.867 | 0.915 | 0.947 | 0.963 | 0.973 | 0.981 |
| twitter * | 0.637 | 0.857 | 0.922 | 0.959 | 0.973 | |
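For readers who want to reproduce a rank correlation of this kind, the following Python sketch computes Kendall’s $\tau $ between two closeness assignments. It is only an illustration: the variant used for Table A3 is not specified in this section, so the simple tau-a form (concordant minus discordant pairs over all pairs, with ties counted as neither) is an assumption, as is the function name `kendall_tau`. The quadratic loop is meant for small inputs only.

```python
from itertools import combinations

def kendall_tau(score_a, score_b):
    """Kendall's tau-a between two score dictionaries over the same nodes
    (at least two nodes are required)."""
    nodes = list(score_a)
    concordant = discordant = 0
    for u, v in combinations(nodes, 2):
        sign = (score_a[u] - score_a[v]) * (score_b[u] - score_b[v])
        if sign > 0:
            concordant += 1   # both scores order u and v the same way
        elif sign < 0:
            discordant += 1   # the two scores order u and v differently
    pairs = len(nodes) * (len(nodes) - 1) // 2
    return (concordant - discordant) / pairs

# Hypothetical scores: one adjacent swap among three nodes.
exact = {"x": 3.0, "y": 2.0, "z": 1.0}
approx = {"x": 3.0, "y": 1.0, "z": 2.0}
print(kendall_tau(exact, approx))
```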

**Table A4.** Maximum position of the top-k nodes (for the exact ranking) in the approximate ranking computed by apx-h (over 50 experiments) in the case of the temporal graphs included in our dataset (excluding twit for which the exact ranking could not be computed).

| Name | $k=1$, apx-256 | $k=1$, apx-512 | $k=1$, apx-1024 | $k=5$, apx-256 | $k=5$, apx-512 | $k=5$, apx-1024 | $k=10$, apx-256 | $k=10$, apx-512 | $k=10$, apx-1024 | $k=20$, apx-256 | $k=20$, apx-512 | $k=20$, apx-1024 | $k=100$, apx-256 | $k=100$, apx-512 | $k=100$, apx-1024 |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| topology | 15 | 4 | 5 | 65 | 22 | 11 | 65 | 33 | 19 | 110 | 74 | 36 | 281 | 167 | 137 |
| adult | 2 | 1 | 1 | 9 | 7 | 5 | 17 | 14 | 12 | 47 | 28 | 28 | 219 | 182 | 138 |
| adventure | 1 | 0 | 0 | 44 | 29 | 12 | 63 | 42 | 22 | 156 | 94 | 40 | 506 | 500 | 289 |
| all | 6 | 8 | 3 | 25 | 19 | 11 | 30 | 32 | 18 | 71 | 67 | 35 | 278 | 232 | 166 |
| animation | 7 | 3 | 3 | 33 | 22 | 14 | 38 | 26 | 17 | 66 | 43 | 39 | 273 | 169 | 138 |
| biography | 196 | 15 | 14 | 327 | 115 | 48 | 327 | 204 | 51 | 496 | 411 | 310 | 1854 | 831 | 548 |
| comedy | 5 | 2 | 1 | 33 | 17 | 17 | 37 | 23 | 20 | 74 | 37 | 36 | 340 | 190 | 173 |
| family | 57 | 15 | 7 | 57 | 37 | 14 | 107 | 37 | 20 | 283 | 136 | 71 | 582 | 365 | 270 |
| fantasy | 1072 | 382 | 47 | 2117 | 954 | 396 | 2859 | 1262 | 396 | 2957 | 1262 | 396 | 3409 | 1550 | 647 |
| history | 31 | 14 | 7 | 67 | 35 | 31 | 123 | 62 | 42 | 237 | 141 | 66 | 942 | 517 | 307 |
| music | 59 | 28 | 7 | 59 | 28 | 10 | 69 | 70 | 33 | 190 | 127 | 58 | 986 | 337 | 271 |
| musical | 419 | 35 | 7 | 606 | 109 | 34 | 1031 | 179 | 57 | 1630 | 469 | 237 | 1978 | 1293 | 785 |
| mystery | 4 | 2 | 2 | 130 | 24 | 18 | 179 | 87 | 50 | 219 | 108 | 77 | 879 | 466 | 515 |
| scifi | 48 | 3 | 2 | 96 | 87 | 57 | 582 | 179 | 119 | 582 | 179 | 119 | 2104 | 760 | 305 |
| war | 47 | 5 | 3 | 69 | 33 | 12 | 101 | 51 | 28 | 253 | 240 | 84 | 713 | 356 | 219 |
| western | 3 | 2 | 1 | 13 | 9 | 6 | 44 | 32 | 21 | 44 | 33 | 25 | 280 | 265 | 172 |
| election | 4 | 3 | 3 | 11 | 8 | 7 | 32 | 21 | 17 | 67 | 55 | 40 | 790 | 414 | 243 |
| | 56 | 42 | 30 | 95 | 93 | 51 | 95 | 93 | 51 | 163 | 104 | 94 | 544 | 504 | 474 |
| linux | 5 | 4 | 3 | 79 | 53 | 19 | 143 | 78 | 37 | 175 | 103 | 49 | 279 | 198 | 192 |
| adelaide | 14 | 8 | 4 | 32 | 18 | 12 | 49 | 28 | 22 | 184 | 132 | 84 | 389 | 310 | 222 |
| belfast | 65 | 50 | 11 | 107 | 80 | 42 | 149 | 102 | 76 | 189 | 148 | 126 | 405 | 372 | 381 |
| berlin | 58 | 44 | 12 | 79 | 52 | 36 | 120 | 89 | 61 | 152 | 118 | 108 | 312 | 226 | 198 |
| bordeaux | 30 | 13 | 8 | 46 | 32 | 18 | 111 | 88 | 57 | 280 | 227 | 109 | 671 | 462 | 338 |
| brisbane | 48 | 44 | 15 | 129 | 78 | 62 | 129 | 78 | 62 | 162 | 112 | 85 | 411 | 257 | 224 |
| canberra | 44 | 28 | 22 | 49 | 33 | 29 | 110 | 87 | 72 | 110 | 108 | 93 | 496 | 464 | 360 |
| detroit | 8 | 2 | 3 | 31 | 11 | 8 | 40 | 21 | 19 | 78 | 47 | 39 | 284 | 202 | 173 |
| dublin | 42 | 19 | 7 | 71 | 44 | 35 | 71 | 60 | 42 | 121 | 76 | 61 | 301 | 261 | 231 |
| grenoble | 89 | 71 | 29 | 130 | 71 | 53 | 169 | 105 | 87 | 192 | 157 | 110 | 421 | 383 | 322 |
| helsinki | 78 | 59 | 42 | 163 | 113 | 80 | 449 | 264 | 235 | 449 | 264 | 235 | 583 | 475 | 387 |
| kuopio | 26 | 13 | 14 | 71 | 44 | 21 | 71 | 48 | 32 | 82 | 70 | 61 | 197 | 181 | 144 |
| lisbon | 106 | 73 | 48 | 219 | 113 | 72 | 228 | 173 | 105 | 313 | 232 | 141 | 682 | 491 | 426 |
| luxembourg | 11 | 10 | 6 | 16 | 10 | 10 | 27 | 20 | 17 | 69 | 60 | 50 | 202 | 167 | 143 |
| melbourne | 124 | 58 | 15 | 256 | 116 | 51 | 256 | 116 | 66 | 306 | 185 | 125 | 1467 | 1157 | 706 |
| nantes | 43 | 16 | 8 | 94 | 48 | 49 | 117 | 60 | 49 | 162 | 111 | 73 | 415 | 364 | 339 |
| palermo | 29 | 21 | 20 | 67 | 31 | 24 | 67 | 49 | 38 | 122 | 84 | 69 | 316 | 341 | 218 |
| paris | 116 | 65 | 50 | 190 | 91 | 72 | 208 | 114 | 72 | 298 | 226 | 148 | 530 | 417 | 380 |
| prague | 190 | 147 | 75 | 190 | 147 | 83 | 190 | 147 | 120 | 326 | 230 | 173 | 450 | 439 | 368 |
| rennes | 31 | 20 | 6 | 62 | 33 | 20 | 152 | 92 | 78 | 152 | 101 | 80 | 352 | 302 | 284 |
| rome | 28 | 15 | 13 | 36 | 30 | 25 | 48 | 44 | 32 | 84 | 76 | 49 | 359 | 244 | 180 |
| sydney | 96 | 55 | 27 | 128 | 81 | 62 | 128 | 112 | 73 | 191 | 140 | 102 | 1028 | 645 | 513 |
| toulouse | 57 | 28 | 18 | 77 | 41 | 40 | 88 | 57 | 47 | 131 | 89 | 79 | 309 | 264 | 250 |
| turku | 9 | 6 | 4 | 22 | 17 | 13 | 63 | 41 | 33 | 63 | 45 | 33 | 366 | 233 | 226 |
| venice | 28 | 25 | 10 | 58 | 43 | 29 | 70 | 60 | 42 | 133 | 126 | 97 | 268 | 227 | 207 |
| winnipeg | 10 | 7 | 5 | 20 | 15 | 15 | 26 | 24 | 22 | 45 | 39 | 34 | 321 | 241 | 208 |
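One plausible reading of the statistic reported in Table A4 is sketched below in Python: for each experiment, locate the exact top-k nodes in the approximate ranking and keep the worst position observed. The function name `max_position_of_top_k`, the dictionary-based inputs, the 1-based positions, and the choice of taking the maximum over all experiments are assumptions made for illustration.

```python
def max_position_of_top_k(exact, runs, k):
    """Worst (largest) 1-based position, over all experiments, reached in the
    approximate ranking by any of the k nodes with highest exact closeness."""
    top_k = sorted(exact, key=exact.get, reverse=True)[:k]
    worst = 0
    for run in runs:
        ranking = sorted(run, key=run.get, reverse=True)
        position = {v: i + 1 for i, v in enumerate(ranking)}
        worst = max(worst, max(position[v] for v in top_k))
    return worst

# Hypothetical closeness values for four nodes and two experiments.
exact = {"a": 4.0, "b": 3.0, "c": 2.0, "d": 1.0}
runs = [
    {"a": 4.0, "b": 1.0, "c": 3.0, "d": 2.0},
    {"a": 3.0, "b": 4.0, "c": 2.0, "d": 1.0},
]
print(max_position_of_top_k(exact, runs, k=1))
print(max_position_of_top_k(exact, runs, k=2))
```

Under this reading, retrieving the first value-of-the-table nodes of the approximate ranking is enough to be sure that all the exact top-k nodes are among them in every experiment.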

## References

- Bavelas, A. A Mathematical Model for Group Structures. Appl. Anthropol. **1948**, 7, 16–30.
- Freeman, L.C. Centrality in social networks conceptual clarification. Soc. Netw. **1978**, 1, 215–239.
- Brandes, U. A Faster Algorithm for Betweenness Centrality. J. Math. Sociol. **2001**, 25, 163–177.
- Eppstein, D.; Wang, J. Fast Approximation of Centrality. J. Graph Alg. Appl. **2004**, 8, 39–45.
- Koschützki, D.; Lehmann, K.A.; Peeters, L.; Richter, S.; Tenfelde-Podehl, D.; Zlotowski, O. Centrality Indices. In Network Analysis: Methodological Foundations; Brandes, U., Erlebach, T., Eds.; Springer: Berlin/Heidelberg, Germany, 2005; pp. 16–61.
- Schoch, D.; Brandes, U. Re-conceptualizing centrality in social networks. Eur. J. Appl. Math. **2016**, 27, 971–985.
- Schoch, D. Periodic Table of Network Centrality. Available online: http://schochastics.net/sna/periodic.html (accessed on 27 August 2020).
- Lin, N. Foundations of Social Research; McGraw-Hill: New York, NY, USA, 1976.
- Marchiori, M.; Latora, V. Harmony in the small-world. Phys. A Stat. Mech. Its Appl. **2000**, 285, 539–546.
- Bergamini, E.; Borassi, M.; Crescenzi, P.; Marino, A.; Meyerhenke, H. Computing Top-k Closeness Centrality Faster in Unweighted Graphs. ACM Trans. Knowl. Discov. Data **2019**, 13, 1–40.
- Calabro, C.; Impagliazzo, R.; Paturi, R. The Complexity of Satisfiability of Small Depth Circuits. In Lecture Notes in Computer Science; Springer: Berlin/Heidelberg, Germany, 2009; Volume 5917, pp. 75–85.
- Cohen, E.; Delling, D.; Pajor, T.; Werneck, R.F. Computing Classic Closeness Centrality, at Scale. In Proceedings of the Second ACM Conference on Online Social Networks, Dublin, Ireland, 1–2 October 2014; pp. 37–50.
- NetworKit. Available online: https://networkit.github.io (accessed on 27 August 2020).
- SageMath. Available online: http://www.sagemath.org (accessed on 27 August 2020).
- Holme, P.; Saramäki, J. Temporal networks. Phys. Rep. **2012**, 519, 97–125.
- Latapy, M.; Viard, T.; Magnien, C. Stream graphs and link streams for the modeling of interactions over time. Soc. Netw. Anal. Min. **2018**, 8, 61.
- Michail, O. An Introduction to Temporal Graphs: An Algorithmic Perspective. Internet Math. **2016**, 12, 239–280.
- Tang, J.K.; Musolesi, M.; Mascolo, C.; Latora, V.; Nicosia, V. Analysing information flows and key mediators through temporal centrality metrics. In Proceedings of the 3rd Workshop on Social Network Systems, Paris, France, 22–25 June 2010; p. 3.
- Santoro, N.; Quattrociocchi, W.; Flocchini, P.; Casteigts, A.; Amblard, F. Time-Varying Graphs and Social Network Analysis: Temporal Indicators and Metrics. arXiv **2011**, arXiv:1102.0629.
- Kim, H.; Anderson, R. Temporal node centrality in complex networks. Phys. Rev. E **2012**, 85, 026107.
- Gao, Z.; Shi, Y.; Chen, S. Measures of node centrality in mobile social networks. Int. J. Mod. Phys. C **2015**, 26, 1550107.
- Magnien, C.; Tarissan, F. Time Evolution of the Importance of Nodes in Dynamic Networks. In Proceedings of the 2015 IEEE/ACM International Conference on Advances in Social Networks Analysis and Mining (ASONAM), Paris, France, 25–28 August 2015; pp. 1200–1207.
- Pereira, F.S.F.; de Amo, S.; Gama, J. Evolving Centralities in Temporal Graphs: A Twitter Network Analysis. In Proceedings of the 2016 17th IEEE International Conference on Mobile Data Management (MDM), Porto, Portugal, 13–16 June 2016; pp. 43–48.
- Pereira, F.S.F.; de Amo, S.; Gama, J. Detecting Events in Evolving Social Networks through Node Centrality Analysis. In CEUR Workshop Proceedings; STREAMEVOLV@ ECML-PKDD: Ghent, Belgium, 2016; Volume 2069.
- Williams, M.J.; Musolesi, M. Spatio-temporal networks: Reachability, centrality and robustness. R. Soc. Open Sci. **2016**, 3, 160196.
- Cordeiro, M.; Sarmento, R.; Brazdil, P.; Gama, J. Evolving Networks and Social Network Analysis Methods and Techniques. In Social Media and Journalism: Trends, Connections, Implications; Višňovský, J., Ed.; IntechOpen: London, UK, 2018.
- Ghanem, M.; Magnien, C.; Tarissan, F. Centrality Metrics in Dynamic Networks: A Comparison Study. IEEE Trans. Netw. Sci. Eng. **2019**, 6, 940–951.
- Wu, H.; Huang, Y.; Cheng, J.; Li, J.; Ke, Y. Reachability and time-based path queries in temporal graphs. In Proceedings of the 2016 IEEE 32nd International Conference on Data Engineering (ICDE), Helsinki, Finland, 16–20 May 2016; pp. 145–156.
- Kossinets, G.; Kleinberg, J.M.; Watts, D.J. The Structure of Information Pathways in a Social Communication Network. In Proceedings of the 14th ACM SIGKDD International Conference on Knowledge Discovery and Data Mining, Las Vegas, NV, USA, 24–27 August 2008; pp. 435–443.
- Kujala, R.; Weckström, C.; Darst, R.; Mladenović, M.; Saramäki, J. A collection of public transport network data sets for 25 cities. Sci. Data **2018**, 5, 180089.
- Crescenzi, P.; Magnien, C.; Marino, A. Approximating the Temporal Neighbourhood Function of Large Temporal Graphs. Algorithms **2019**, 12, 211.
- Dibbelt, J.; Pajor, T.; Strasser, B.; Wagner, D. Connection Scan Algorithm. J. Exp. Alg. **2018**, 23, 1–56.
- Tsalouchidou, I.; Baeza-Yates, R.; Bonchi, F.; Liao, K.; Sellis, T. Temporal betweenness centrality in dynamic graphs. Int. J. Data Sci. Anal. **2020**, 9, 257–272.
- Lv, L.; Zhang, K.; Zhang, T.; Bardou, D.; Zhang, J.; Cai, Y. PageRank centrality for temporal networks. Phys. Lett. A **2019**, 383, 1215–1222.
- Falzon, L.; Quintane, E.; Dunn, J.; Robins, G. Embedding time in positions: Temporal measures of centrality for social network analysis. Soc. Netw. **2018**, 54, 168–178.
- Ni, P.; Hanai, M.; Tan, W.J.; Cai, W. Efficient closeness centrality computation in time-evolving graphs. In Proceedings of the 2019 IEEE/ACM International Conference on Advances in Social Networks Analysis and Mining, Vancouver, Canada, 27–30 August 2019; pp. 378–385.
- Okamoto, K.; Chen, W.; Li, X. Ranking of Closeness Centrality for Large-Scale Social Networks. In International Workshop on Frontiers in Algorithmics; Springer: Berlin/Heidelberg, Germany, 2008; pp. 186–195.
- Merrer, E.L.; Scouarnec, N.L.; Trédan, G. Heuristical top-k: Fast estimation of centralities in complex networks. Inf. Process. Lett. **2014**, 114, 432–436.
- Casteigts, A.; Flocchini, P.; Quattrociocchi, W.; Santoro, N. Time-varying graphs and dynamic networks. Int. J. Parallel Emergent Distrib. Syst. **2012**, 27, 387–408.
- Crescenzi, P.; Grossi, R.; Lanzi, L.; Marino, A. A Comparison of Three Algorithms for Approximating the Distance Distribution in Real-World Graphs. In International Conference on Theory and Practice of Algorithms in (Computer) Systems; TAPAS; Springer: Berlin/Heidelberg, Germany, 2011; pp. 92–103.
- Dubhashi, D.P.; Panconesi, A. Concentration of Measure for the Analysis of Randomized Algorithms; Cambridge University Press: Cambridge, UK, 2009.
- Zhang, B.; Liu, R.; Massey, D.; Zhang, L. Collecting the Internet AS-level Topology. ACM SIGCOMM Comput. Commun. Rev. **2005**, 35, 53–61.
- Kunegis, J. KONECT: The Koblenz Network Collection. In Proceedings of the 22nd International Conference on World Wide Web, Rio de Janeiro, Brazil, 13–17 May 2013; pp. 1343–1350.
- Kunegis, J. The KONECT Project. Available online: http://konect.cc (accessed on 27 August 2020).
- IMDb. IMDb Datasets. Available online: http://www.imdb.com/interfaces (accessed on 27 August 2020).
- Viswanath, B.; Mislove, A.; Cha, M.; Gummadi, K.P. On the Evolution of User Interaction in Facebook. In Proceedings of the 2nd ACM Workshop on Online Social Networks, WOSN, Barcelona, Spain, 17 August 2009; pp. 37–42.
- Borra, E.; Rieder, B. Programmed method: Developing a toolset for capturing and analyzing tweets. Aslib J. Inf. Manag. **2014**, 66, 262–278.
- Borra, E.; Rieder, B. Twitter Migrants Network. Available online: http://data.complexnetworks.fr/Migrants/ (accessed on 27 August 2020).
- Vigna, S. A Weighted Correlation Index for Rankings with Ties. In Proceedings of the 24th International Conference on World Wide Web, Florence, Italy, 18–22 May 2015; pp. 1166–1176.
- Gauvin, L.; Génois, M.; Karsai, M.; Kivelä, M.; Takaguchi, T.; Valdano, E.; Vestergaard, C.L. Randomized reference models for temporal networks. arXiv **2018**, arXiv:1806.04032.
- Olsen, P.W.; Labouseur, A.G.; Hwang, J. Efficient top-k closeness centrality search. In Proceedings of the 2014 IEEE 30th International Conference on Data Engineering, Chicago, IL, USA, 31 March–4 April 2014; pp. 196–207.

**Figure 1.** An example of a temporal undirected graph with the three temporal edges $(a,b,2)$, $(a,c,4)$, and $(b,c,1)$ (**left**) and of the corresponding t-distances (**right**).

**Figure 2.** The evolution of the t-closeness of Christopher Lee, in red, and Eleanor Parker, in black (**left**), and the top-20 nodes in Paris according to the temporal closeness (**right**).

**Figure 3.** The update rule of the temporal breadth-first search algorithm for computing the closeness of a node s (**left**) and of its “backward” version for computing the contribution of a node d to the closeness of all the other nodes (**right**).

**Figure 4.** The mean absolute error (MAE) of apx-h as a function of h for the temporal graphs come, fbwa, and melb. For each graph and for each sample size h, the corresponding box-and-whisker plot depicts the MAE through its quartiles.

**Figure 5.** Relative error (RE) of apx-1024 as a function of rank position for the graphs come, fbwa, and melb. The horizontal axis corresponds to the position of a node in the exact ranking, while the black (respectively, red dashed) plot indicates the maximum average (respectively, maximum) RE (over 50 experiments) of all the nodes up to that position. The plot is in log-log scale. Note that there are groups of nodes with very similar relative error: a preliminary analysis of this phenomenon indicates that it is due to the existence of several small cliques disconnected from the rest of the graph.

**Figure 6.** Average Kendall’s $\tau $ values for undirected (**left**) and directed (**right**) graphs as a function of the sample size h. The average Kendall’s $\tau $ of apx-h (over 50 experiments) is computed with respect to the ranking computed by exact, except for the twit graph, where the reference is the ranking computed by apx-1024 (for this reason, its plot stops at h = 512).

**Figure 7.** Box-and-whisker plots of the maximum position of the top-20 nodes in the approximate ranking as a function of the sample size for the temporal graphs come, fbwa, and melb.

**Table 1.** A sample of our dataset. For each graph we report the number of nodes, the number of temporal edges, and the running times (in seconds) of exact (the cell marked with * is an estimate) and of apx-1024 (average over 50 experiments). The running time of apx-h, for any other value of h, can be estimated as $h\cdot t/1024$, where t is the running time of apx-1024.

Undirected graph | Nodes | Edges | exact (s) | apx-1024 (s) | Directed graph | Nodes | Edges | exact (s) | apx-1024 (s) |
---|---|---|---|---|---|---|---|---|---|
fant | 34,464 | 87,331 | 1815 | 33 | melb | 19,493 | 1,098,227 | 6258 | 380 |
topo | 34,761 | 154,842 | 1649 | 47 | fbwa | 46,952 | 876,993 | 12,184 | 264 |
come | 162,303 | 666,568 | 29,601 | 203 | linu | 63,400 | 1,096,400 | 19,313 | 317 |
all | 527,535 | 3,152,994 | 484,906 | 941 | twit | 3,511,241 | 16,438,790 | * 97,553,304 | 28,449 |
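
The estimate $h\cdot t/1024$ stated in the caption of Table 1 can be illustrated with a minimal sketch. The function name `estimate_apx_runtime` and the dictionary `APX_1024_SECONDS` are hypothetical names introduced here for illustration only; the three running times are copied from Table 1, and the linear scaling in h is the assumption made in the caption, not a measured result.

```python
def estimate_apx_runtime(h: int, t_1024: float) -> float:
    """Estimate the running time (in seconds) of apx-h.

    Applies the linear rule h * t / 1024 from the caption of Table 1,
    where t_1024 is the measured running time of apx-1024.
    """
    if h <= 0:
        raise ValueError("the sample size h must be positive")
    return h * t_1024 / 1024


# apx-1024 running times (seconds) copied from Table 1.
APX_1024_SECONDS = {"come": 203, "fbwa": 264, "melb": 380}

for name, t in APX_1024_SECONDS.items():
    for h in (256, 512):
        estimate = estimate_apx_runtime(h, t)
        print(f"{name}: apx-{h} is estimated at {estimate:.2f} s")
```

Under this rule, halving the sample size halves the estimated running time, which is the cost side of the accuracy trade-off reported in Table 2.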

**Table 2.** Maximum position of the top-k nodes (for the exact ranking) in the approximate ranking computed by apx-h (over 50 experiments) in the case of the temporal graphs included in our sample dataset (excluding twit, for which the exact ranking could not be computed).

Name | $k=1$, $h=256$ | $k=1$, $h=512$ | $k=1$, $h=1024$ | $k=5$, $h=256$ | $k=5$, $h=512$ | $k=5$, $h=1024$ | $k=10$, $h=256$ | $k=10$, $h=512$ | $k=10$, $h=1024$ | $k=20$, $h=256$ | $k=20$, $h=512$ | $k=20$, $h=1024$ | $k=100$, $h=256$ | $k=100$, $h=512$ | $k=100$, $h=1024$ |
---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
fant | 1072 | 382 | 47 | 2117 | 954 | 396 | 2859 | 1262 | 396 | 2957 | 1262 | 396 | 3409 | 1550 | 647 |
topo | 15 | 4 | 5 | 65 | 22 | 11 | 65 | 33 | 19 | 110 | 74 | 36 | 281 | 167 | 137 |
come | 5 | 2 | 1 | 33 | 17 | 17 | 37 | 23 | 20 | 74 | 37 | 36 | 340 | 190 | 173 |
all | 6 | 8 | 3 | 25 | 19 | 11 | 30 | 32 | 18 | 71 | 67 | 35 | 278 | 232 | 166 |
fbwa | 56 | 42 | 30 | 95 | 93 | 51 | 95 | 93 | 51 | 163 | 104 | 94 | 544 | 504 | 474 |
linu | 5 | 4 | 3 | 79 | 53 | 19 | 143 | 78 | 37 | 175 | 103 | 49 | 279 | 198 | 192 |
melb | 124 | 58 | 15 | 256 | 116 | 51 | 256 | 116 | 66 | 306 | 185 | 125 | 1467 | 1157 | 706 |
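
The quantity reported in Table 2 can be sketched for a single run as follows. This is only an illustration under stated assumptions: the function name `max_position_of_top_k` and the toy rankings are hypothetical, rankings are assumed to be lists of node identifiers sorted from most to least central with no ties, and the aggregation over the 50 experiments is not reproduced because it is not specified in the caption.

```python
def max_position_of_top_k(exact_ranking, approx_ranking, k):
    """Worst position, in the approximate ranking, of an exact top-k node.

    Positions are 1-based. The returned value is the length of the shortest
    prefix of approx_ranking that contains all of the first k nodes of
    exact_ranking.
    """
    if not 1 <= k <= len(exact_ranking):
        raise ValueError("k must be between 1 and the number of ranked nodes")
    # Map every node to its 1-based position in the approximate ranking.
    position = {node: i + 1 for i, node in enumerate(approx_ranking)}
    return max(position[node] for node in exact_ranking[:k])


# Toy rankings (hypothetical data, not taken from the experiments).
exact = ["a", "b", "c", "d", "e"]
approx = ["b", "a", "d", "c", "e"]

for k in (1, 2, 3):
    print(k, max_position_of_top_k(exact, approx, k))
```

A value close to k means that the approximate ranking places the exact top-k nodes near the top, so inspecting a short prefix of the approximate ranking suffices to recover them.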

© 2020 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (http://creativecommons.org/licenses/by/4.0/).

## Share and Cite

**MDPI and ACS Style**

Crescenzi, P.; Magnien, C.; Marino, A.
Finding Top-*k* Nodes for Temporal Closeness in Large Temporal Graphs. *Algorithms* **2020**, *13*, 211.
https://doi.org/10.3390/a13090211
