Approximating the Temporal Neighbourhood Function of Large Temporal Graphs

Temporal networks are graphs in which edges have temporal labels, specifying their starting times and their traversal times. Several notions of distances between two nodes in a temporal network can be analyzed, by referring, for example, to the earliest arrival time or to the latest starting time of a temporal path connecting the two nodes. In this paper we mostly refer to the notion of temporal reachability by using the earliest arrival time. In particular, we first show how the sketch approach, which has already been used in the case of classical graphs, can be applied to the case of temporal networks in order to approximately compute the sizes of the temporal cones of a temporal network. By making use of this approach, we subsequently show how we can approximate the temporal neighborhood function (that is, the number of pairs of nodes reachable from one another in a given time interval) of large temporal networks in a few seconds. Finally, we apply our algorithm in order to analyze and compare the behavior of 25 public transportation temporal networks. Our results can be easily adapted to the case in which we want to refer to the notion of distance based on the latest starting time.


Introduction
Temporal networks are graphs in which nodes and edges can appear or disappear over time, due not only to failures or malfunctioning of the entities participating in the system represented by the temporal graph, but mostly to the "normal" behaviour of the system itself. A typical temporal network is a person-to-person communication network within a company. In such a network, for example, nodes can appear or disappear (depending on the recruitment policy of the company), and edges appear whenever an employee of the company sends an e-mail message to another employee of the company. In this paper, we will focus on temporal networks in which the set of nodes does not change over time (at least over a specified interval of time). Moreover, we consider only the case in which edges are available at discrete time instants, so that the dynamics of the network is specified only by the appearance times of the edges.
As observed in [1], temporal networks can model a great variety of systems in nature, society and technology. Several different types of temporal networks have, indeed, been analyzed: person-to-person communication networks (such as e-mails or phone calls), one-to-many information spreading networks (such as Twitter interactions), contact networks (such as cell phone proximity detections), biological networks (such as protein interactions), distributed computing networks (such as autonomous system communications), infrastructure networks (such as public transport timetables), and many others. It is worth observing, however, that different names have been used for denoting temporal networks (even though the basic notion was almost the same), such as, for example, dynamic networks [2], time-varying graphs [3], evolving networks [4], and link streams [5].
In this paper we are interested in studying reachability properties of temporal networks. As already observed in [6], if we just remove all time information from the temporal graph (and collapse, if necessary, multiple edges between any two vertices into a single edge), we clearly lose all the temporal information of the graph. This loss can be critical to the understanding of the reachability relationships between the nodes of the graph. For example, in the left part of Figure 1 a temporal network with five nodes and five temporal edges is represented, while in the right part of the figure the "non-temporal" version of the graph (in which all edge temporal labels have been removed) is shown. It is easy to verify that the two simple paths from node 1 to node 2 in the non-temporal graph do not exist in the temporal network (a temporal path is, intuitively, a path such that each edge in the path appears later than the edges preceding it in the path). Indeed, the edge from node 3 to node 2 appears at time 1, hence it cannot be used within a path starting from node 1, since all this node's edges appear after that time. Moreover, the path of length 2 from node 1 to node 5 in the non-temporal graph does not exist in the temporal graph, since it is only possible to reach node 3 from node 1 in one step at time 5, while the edge from node 3 to node 5 appears at time 4. In summary, removing temporal information may lead us to conclude that two nodes (e.g., 1 and 2) are reachable from one another while they are not, or that the length of the shortest path between two nodes (e.g., 1 and 5) is smaller than it really is.
For this reason, we are interested in developing algorithmic techniques which allow us to efficiently compute aggregate information about temporal paths and time-distances between pairs of nodes. In particular, we focus on the temporal neighborhood function (in short, TNF) of a temporal network, which is the natural extension of the neighborhood function already widely analyzed in the case of non-temporal graphs [7,8]. More precisely, given a time interval $I = [t_\alpha, t_\omega]$, the temporal neighborhood function returns the value $|N(I)|$, where $N(I)$ denotes the set of pairs of nodes $(u, v)$ such that there exists a temporal path from $u$ to $v$ which starts from $u$ not earlier than $t_\alpha$, arrives in $v$ no later than $t_\omega$, and is such that each edge appears later than the edges preceding it.
By assuming that a temporal network is represented by the sequence of its temporal edges, ordered in non-decreasing order with respect to their appearance times, the temporal neighborhood function can be easily computed by making use of the following "scan-based" algorithm [9], which allows us to compute the cardinality of the temporal cone of a node $s$, that is, of the set of nodes reachable from $s$ in the interval $[t_\alpha, t_\omega]$.

Return the number of elements of t different from ∞.
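The scan-based procedure can be sketched as follows. This is only a minimal illustration under stated assumptions, not the code of [9]: the function name `temporal_cone_size` and the `directed` flag are hypothetical, edges are assumed to be `(u, v, t)` triples already sorted by non-decreasing `t`, and consecutive edges of a temporal path are assumed to have strictly increasing appearance times.

```python
import math

def temporal_cone_size(edges, s, t_alpha, t_omega, directed=True):
    """Number of nodes reachable from s in [t_alpha, t_omega], s included.

    `edges` is a list of (u, v, t) triples sorted by non-decreasing t.
    """
    # arrival[x] plays the role of the array t: earliest arrival time at x.
    # Nodes absent from the dictionary have arrival time "infinity".
    arrival = {s: t_alpha - 1}  # s can use any edge appearing from t_alpha on
    for u, v, t in edges:
        if t < t_alpha:
            continue
        if t > t_omega:
            break  # edges are sorted, nothing useful follows
        pairs = [(u, v)] if directed else [(u, v), (v, u)]
        for a, b in pairs:
            # The edge is usable if a was reached strictly before t,
            # and it is useful if it improves the arrival time at b.
            if arrival.get(a, math.inf) < t < arrival.get(b, math.inf):
                arrival[b] = t
    return len(arrival)  # entries different from infinity

# Hypothetical toy stream (not the graph of Figure 1).
toy = [(1, 2, 1), (2, 3, 1), (2, 3, 2), (3, 4, 2), (4, 1, 5)]
print(temporal_cone_size(toy, 1, 1, 5))
```

Each call scans the edge list once, which matches the linear cost per source node discussed next.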
If $m$ is the number of temporal edges, it is easy to verify that the complexity of the above algorithm is $O(m)$. In order to compute the temporal neighborhood function, we have to execute the above procedure starting from every node of the temporal network. We refer to this algorithm as ETNF (Exact TNF). Hence, if $n$ denotes the number of nodes, we can conclude that ETNF runs in time $O(nm)$.
Unfortunately, this time complexity is not acceptable when dealing with real-world temporal networks, where there are millions of nodes and billions of temporal edges. For this reason, in this paper we propose a new algorithm called ATNF for approximating the temporal neighborhood function of a temporal network.

Our results
In order to approximate the temporal neighborhood function, we first describe a simple dynamic programming algorithm for computing the reverse temporal cone of a node $s$, which is the set of nodes that can reach $s$ in a specific interval $[t_\alpha, t_\omega]$. Note that, if we can compute the cardinalities of the reverse temporal cones, then we can also compute the temporal neighborhood function. We then show how the sketch approach, which has already been widely used in the case of non-temporal graphs [10][11][12][13], can be applied to the case of temporal networks in order to approximately compute the cardinalities of the reverse temporal cones of a temporal network. More specifically, the resulting approximation algorithm, called ATNF (Approximated TNF), has relative error bounded by $\epsilon$ with high probability, whenever the sketches have size $k = \Theta(\log n / \epsilon^2)$. The time complexity of the algorithm is $O(km)$. We then experimentally evaluate the quality of the approximation performed by ATNF by comparing the approximate value of the temporal neighborhood function with the exact one (computed by making use of the scan-based exact algorithm described in the previous section) on a data-set containing several medium-size temporal networks. As a matter of fact (and as expected), the obtained approximation is much better than the one guaranteed in theory, even when the size of the sketches is significantly smaller than the required size. By making use of the approximation algorithm, we hence show how we can accurately approximate, in a few minutes, the temporal neighborhood function (and, hence, the distance distribution) of two large temporal networks: the IMDB collaboration network (which is undirected) and the Twitter re-tweets network (which is directed). The first network contains more than half a million nodes and more than three million edges, while the second network contains more than two million nodes and more than sixteen million edges.
Finally, we apply ATNF in order to analyze and compare the behavior of twenty-five public transportation temporal networks [14]. In particular, we analyze the reachability properties of these networks, by computing the values of the temporal neighborhood function in different intervals during a day. As a matter of fact, we observe that there are cities which perform significantly better than others with respect to these reachability properties, and that this result cannot always be deduced by simply looking at the density of the temporal network. Moreover, we show that the quality of the approximation is preserved even with small values of the sketch sizes: this allowed us to perform the entire experimental evaluation for all the cities in less than one hour (while the exact algorithm would have required almost four days).

Related work
Algorithms for distance computation in temporal networks have been proposed in [6] and [15]. [16] uses an algorithm similar to that of [6] for computing spanning trees. [9] proposes a method for indexing data and answering reachability and path queries in temporal networks. The problem of finding different types of paths (minimum traveling time, earliest arrival time, and minimum number of hops) in temporal networks has also been investigated in a distributed context [17].
A problem related to the one analyzed in this paper is computing distances in road networks, where edges have a traversal time but are supposed to exist at all times. In this case, the focus is mainly on pre-computing some information in order to be able to answer fastest path queries very quickly (see, for example, [18,19]). [20] focuses on the impact of the passenger demand on the performance of path finding algorithms. Concerning public transportation networks, many works have focused on the efficient design of such networks, by considering, e.g., where to place hubs (see, for instance, [21,22]), or on their robustness or resilience, by studying how perturbations on specific nodes or links, or changes in demand, affect the whole network (see, for instance, [23,24]). Other works have used connectivity notions to evaluate the importance of nodes in transportation networks (see, for instance, [25] and references therein).

Structure of the paper
In Section 2, we give all basic definitions and notations concerning temporal networks, paths, cones and neighborhood functions. In Section 3, we describe the dynamic programming algorithm for computing reverse temporal cones, and we show how the bottom-k sketch approach can be used in order to approximate the cardinalities of these cones (and, hence, the temporal neighborhood function), getting our approximation algorithm ATNF. In Section 4 we experimentally evaluate the quality of the approximation of ATNF and its running time, comparing our results with the ones of ETNF. Moreover, we use the algorithm itself to compute the temporal neighborhood functions of two large temporal networks. Finally, in Section 5, we apply ATNF for comparing the reachability properties of twenty-five public transportation networks.

Definitions and notations
The following definitions are mostly inspired by [3,5,9]. A temporal graph is a pair $G = (V, E)$, where $V$ is the set of nodes and $E$ is the set of temporal edges. A temporal edge $e \in E$ is a triple $(u, v, t)$, where $u, v \in V$ are the source and destination nodes of the edge, respectively, and $t \in \mathbb{N}$ is its appearing time. When the temporal edges are bidirectional, $(u, v, t)$ can also be written as $(v, u, t)$. The time horizon $T(G)$ of a temporal graph $G$ is the set of all the appearing times of its temporal edges. For instance, in the left part of Figure 2 a temporal graph with 5 nodes and 12 temporal edges is shown: its time horizon is {1, 2, 3, 4, 5, 6, 7, 8, 9, 11, 12}. A different way of representing a temporal graph is the link stream diagram shown in the right part of Figure 2, where the labels on the left of the diagram represent the nodes of the graph while the labels above the diagram represent the different time instants. In this paper, we will indeed assume that the edges are given to the algorithms one after the other (similarly to the streaming model) in non-decreasing order with respect to the appearing time: this corresponds to reading the edges in the link stream diagram from left to right.
A temporal path from $u$ to $v$ is a sequence of temporal edges $(u_1, v_1, t_1), \ldots, (u_k, v_k, t_k)$ such that $u = u_1$, $v = v_k$, and, for each $i$ with $1 < i \le k$, $u_i = v_{i-1}$ and $t_i > t_{i-1}$.

Temporal paths
The length of a temporal path is the number of temporal edges it contains. For example, referring to the temporal graph of Figure 2, (1, 4, 1), (4, 2, 5), (2, 5, 11), (5, 3, 12) is a temporal path of length 4. On the contrary, (1, 4, 1), (4, 2, 5), (2, 3, 2) is not a temporal path, since the appearing time of the second edge is larger than the appearing time of the third edge. The starting time (respectively, ending time) of a temporal path $P$, denoted by $\sigma(P)$ (respectively, $\eta(P)$), is equal to the appearing time of the first (respectively, last) edge in the path. For instance, if $P$ = (1, 4, 1), (4, 2, 5), (2, 5, 11), (5, 3, 12), then $\sigma(P) = 1$ and $\eta(P) = 12$. Given a time interval $I = [t_\alpha, t_\omega]$ and two nodes $u$ and $v$, we will denote by $\mathcal{P}(u, v, I)$ the set of all temporal paths $P$ from $u$ to $v$ such that $t_\alpha \le \sigma(P) \le \eta(P) \le t_\omega$. Among all temporal paths between two nodes in a given time interval, in this paper we will distinguish the ones which allow us to arrive as early as possible.
Definition 1. Given a temporal graph $G = (V, E)$, two nodes $u$ and $v$ in $V$, and a time interval $I = [t_\alpha, t_\omega]$, a path $P \in \mathcal{P}(u, v, I)$ is said to be an earliest arrival path if $\eta(P) = \min\{\eta(P') : P' \in \mathcal{P}(u, v, I)\}$.
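The two examples above can be checked mechanically. The following is a small sketch, assuming that a temporal path is a list of `(u, v, t)` triples in which consecutive edges are chained and appearing times strictly increase; the function names are hypothetical and only serve the illustration.

```python
def is_temporal_path(path):
    """True if each edge starts where the previous one ends
    and appears strictly later than the previous one."""
    return all(
        prev[1] == nxt[0] and prev[2] < nxt[2]
        for prev, nxt in zip(path, path[1:])
    )

def starting_time(path):
    """sigma(P): appearing time of the first edge."""
    return path[0][2]

def ending_time(path):
    """eta(P): appearing time of the last edge."""
    return path[-1][2]

# The two sequences discussed in the text.
P = [(1, 4, 1), (4, 2, 5), (2, 5, 11), (5, 3, 12)]
Q = [(1, 4, 1), (4, 2, 5), (2, 3, 2)]

print(is_temporal_path(P), len(P), starting_time(P), ending_time(P))  # True 4 1 12
print(is_temporal_path(Q))  # False
```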
For example, the left part of Figure 3 shows the earliest arrival path from node 1 to node 3 in the time interval [1,12] in the temporal graph of Figure 2: this path consists of the temporal edges (1,4,1), (4,5,3), and (5,3,4), its starting time is 1, and its ending time is 4. The right part of the figure, instead, shows an earliest arrival path from node 1 to node 3 in the time interval [2,12]: it consists of the temporal edges (1,4,6) and (4,3,9), its starting time is 6, and its ending time is 9 (note that another earliest arrival path from node 1 to node 3 in the same interval is the one consisting of the temporal edges (1,5,7), (5,4,8), and (4, 3, 9)).

Temporal reachability cones and neighborhood functions
Given a temporal graph $G = (V, E)$ and a time interval $I = [t_\alpha, t_\omega]$, in this paper we are interested in efficiently computing the number of pairs of nodes $(u, v)$ such that $v$ can be reached from $u$ in the interval $I$. To this aim, we give the following definition.
Definition 2. Given a temporal graph $G = (V, E)$, a node $u$, and a time interval $I = [t_\alpha, t_\omega]$, the temporal reachability cone of $u$ in the interval $I$ is defined as $C(u, I) = \{u\} \cup \{v \in V : \mathcal{P}(u, v, I) \neq \emptyset\}$.
For example, by referring to the temporal graph of Figure 2, the left part of Figure 4 shows the temporal reachability cone of node 1 in the interval [1,5]: in this case, all nodes can be reached from node 1. The right part of the figure, instead, shows the temporal reachability cone of node 1 in the interval [9,12]: in this case, all nodes can be reached from node 1 apart from node 4. It is also easy to verify that the temporal reachability cone of node 1 in the interval [2,5] contains only node 1, since there are no temporal edges leaving it in this time interval.
Figure 4. Temporal reachability cones of node 1: the first one (left) is in the interval [1,5] and includes all nodes, while the second one (right) is in the interval [9,12] and includes all nodes but node 4.

Computing temporal neighborhood functions through reverse temporal cones
As we already said, in this paper we consider the left-to-right order for reading the stream of temporal edges, which corresponds to reading the edges in non-decreasing order with respect to their appearing times.
Definition 4. Given a temporal graph $G = (V, E)$ and a time instant $t \in T(G)$, the predecessor $\mathrm{pred}(t)$ of $t$ in $G$ is the maximum time instant $t' \in T(G)$ such that $t' < t$ (if $t'$ does not exist, then we set $\mathrm{pred}(t) = \bot$).
In order to compute the temporal neighborhood function of a temporal graph $G$ in a given time interval $[t_\alpha, t_\omega]$, we first introduce the following definition.
Definition 5. Given a temporal graph $G = (V, E)$, a node $u$, and a time interval $I = [t_\alpha, t_\omega]$, the reverse temporal cone of $u$ in the interval $I$ is defined as $R(u, I) = \{v \in V : \mathcal{P}(v, u, I) \neq \emptyset\}$.
In other words, the reverse temporal cone of $u$ in the interval $I$ contains all nodes $v$ that can reach $u$ in $I$ (hence, if $v \in R(u, I)$, then $u \in C(v, I)$). The advantage of referring to reverse temporal cones is that these cones can be easily computed by using the following dynamic programming algorithm (observe that, for any $t \in T(G)$, $C(u, [t, t])$ is the neighborhood of $u$ at time $t$, that is, the set of nodes $v$ such that $(u, v, t) \in E$).
The base case simply states that, if there is an edge $(v, u, t_\alpha)$, then $v$ belongs to the reverse temporal cone of $u$ in the interval $[t_\alpha, t_\alpha]$. The recursive step, instead, says that, if there is an edge $(v, u, t)$, then all nodes that could reach $v$ before $t$ can now also reach $u$ at time $t$. For example, by referring to the temporal graph of Figure 2 (in which all edges are bidirectional), the following table shows the evolution of the reverse temporal cones $R(u, [1, 6])$ according to the above algorithm (until all nodes are reachable from all other nodes).
) the first time an edge appearing at time $t$ is scanned. Since the size of the intermediate reverse temporal cones can be linear in the number $n$ of nodes in the graph, the time complexity of this algorithm is $O(nm)$. Moreover, if we want to maintain all the intermediate reverse temporal cones, then the space complexity of the algorithm is $O(n^2 m)$. However, if we just want to compute the final reverse temporal cones, then the space complexity can be reduced to $O(n^2)$. Observe that, if the edges are bidirectional, then we can easily modify this algorithm by simply considering each edge $(v, u, t)$ twice: once as $(v, u, t)$, and once as $(u, v, t)$.
Once we have computed the reverse temporal cones in a specific interval $I$, we can easily compute $N(I)$. Indeed, for each pair of nodes $u$ and $v$, $(u, v)$ belongs to $N(I)$ if and only if $u \in R(v, I)$. This implies that $|N(I)| = \sum_{v \in V} |R(v, I)|$. Hence, the temporal neighborhood function can be computed by using the sizes of the reverse temporal cones.
Remark. Note that the above dynamic programming approach does not seem to be applicable to the computation of temporal cones, as defined in the previous section. More precisely, it is not clear how to rewrite the recursive step of the algorithm when referring to temporal cones. Indeed, in general it holds that
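The base case and the recursive step described above can be sketched as follows. This is a minimal illustration built only from that verbal description, not a transcription of the original pseudocode: the function name `reverse_cones` is hypothetical, edges are assumed to be `(v, u, t)` triples sorted by non-decreasing `t`, and all edges sharing the same appearing time are processed against the cones as they were at the predecessor time instant.

```python
from itertools import groupby

def reverse_cones(nodes, edges, t_alpha, t_omega, directed=True):
    """Exact reverse temporal cones R(u, [t_alpha, t_omega]) for every u.

    `edges` is a list of (v, u, t) triples sorted by non-decreasing t.
    """
    cones = {u: set() for u in nodes}
    window = [e for e in edges if t_alpha <= e[2] <= t_omega]
    for _, same_time in groupby(window, key=lambda e: e[2]):
        additions = {}
        for v, u, _t in same_time:
            pairs = [(v, u)] if directed else [(v, u), (u, v)]
            for src, dst in pairs:
                # src, and everything that reached src strictly before t,
                # can now reach dst at time t.
                additions.setdefault(dst, set()).update({src} | cones[src])
        # Apply the updates only after the whole time instant was scanned,
        # so that edges with equal appearing times are never chained.
        for dst, extra in additions.items():
            cones[dst] |= extra
    return cones

# Hypothetical toy stream (not the graph of Figure 2).
toy = [(1, 2, 1), (2, 3, 1), (2, 3, 2), (3, 4, 2), (4, 1, 5)]
print(reverse_cones([1, 2, 3, 4], toy, 1, 5))
```

In this sketch a node can end up in its own reverse cone when the stream contains a temporal cycle through it; how such pairs should be counted is left open here.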

Approximating the size of the reverse temporal cones
As we have already observed, the time complexity of the dynamic programming algorithm for computing the reverse temporal cones is $O(nm)$, where $n = |V|$ and $m = |E|$. This complexity can turn out to be prohibitive when dealing with temporal graphs with thousands of nodes and millions of temporal edges. For this reason, in this section we propose ATNF, an algorithm for computing an approximation of the size of the reverse temporal cones, which is based on sketch techniques: these techniques allow us to represent the reverse temporal cone of each node $u$ in a compressed approximated form of (almost) constant size $k$ (typically $O(\log n)$). By using cone sketches in place of real cones, we will obtain a dynamic programming approximation algorithm whose time complexity is $O(km)$.

Sketch operations
Given a set A, a sketch S(A) is a compressed form of representation of A of size O(k), with k ∈ N, providing the following operations.

INIT(S(A)): How a sketch S(A) for A is initialized.
UPDATE(S(A), u): How a sketch S(A) for A is modified when an element u is added to A.
UNION(S(A), S(B)): Given two sketches for A and B, provide a sketch for A ∪ B.
SIZE(S(A)): Estimate the number of distinct elements of A.
The following two requirements are needed for sketches: (i) given two sketches S(A) and S(B) for any two sets A and B, S(A ∪ B) can be computed just by looking at S(A) and S(B), and (ii) neither the order in which the elements are added nor adding an element twice affects the sketch. We assume that |A| > k > 1.

Bottom-k sketches
One of the most popular sketch techniques is the bottom-k technique, which works as follows. Given a mapping $r : U \to \{1/n, 2/n, \ldots, n/n\}$ and a subset $A$ of $U$, we denote as $H_k(A)$ the first $k$ elements of $A$ according to $r$ (or

INIT (S(A)):
As an example, let $U$ be the set of the 26 letters of the alphabet and let $r : U \to \{1/26, 2/26, \ldots, 26/26\}$ be the following function:
[Table: the value $r(x)$ assigned to each letter $x$ from a to z]
Let A = {a, l, i, c, e}, B = {w, o, n, d, e, r, l, a}, and k = 3. Then S(A) is {a, i, c} and S(B) is {o, r, a}, as these letters are the three elements of A and B, respectively, having minimum r-values. Hence, S(A ∪ B) consists of the three elements of S(A) ∪ S(B) having minimum r-values. The size of A ∪ B = {a, l, i, c, e, w, o, n, d, r}, which is equal to 10, can then be estimated by applying the SIZE operation to S(A ∪ B).
It can be shown that the mean relative error of the sketch sizes with respect to the real sizes is bounded by $0.79/\sqrt{k - 2}$ and that, if we choose $k \in \Theta(\log |U| / \epsilon^2)$, the relative error is bounded by $\epsilon$ with high probability [10,11].
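The four sketch operations can be illustrated with a small bottom-k implementation. This is a hedged sketch, not the implementation evaluated in the paper: the class `BottomK` and the helper `random_ranks` are hypothetical names, the ranks are drawn as a random permutation of the universe, and SIZE uses the standard bottom-k estimator $(k-1)/r_k$, where $r_k$ is the largest rank kept in the sketch, falling back to the exact count when fewer than k elements are stored.

```python
import random

def random_ranks(universe, seed=0):
    """Hypothetical helper: a random bijection r : U -> {1/n, ..., n/n}."""
    order = list(universe)
    random.Random(seed).shuffle(order)
    return {x: (i + 1) / len(order) for i, x in enumerate(order)}

class BottomK:
    def __init__(self, k, ranks):
        # INIT: an empty sketch that will keep at most k elements.
        self.k = k
        self.ranks = ranks
        self.items = []  # kept sorted by increasing rank

    def update(self, x):
        # UPDATE: insert x and keep only the k elements of minimum rank.
        if x not in self.items:
            self.items = sorted(self.items + [x], key=self.ranks.get)[: self.k]

    def union(self, other):
        # UNION: the bottom-k of A ∪ B only depends on the two sketches.
        merged = BottomK(self.k, self.ranks)
        both = set(self.items) | set(other.items)
        merged.items = sorted(both, key=self.ranks.get)[: self.k]
        return merged

    def size(self):
        # SIZE: exact below k elements, (k - 1) / r_k otherwise.
        if len(self.items) < self.k:
            return float(len(self.items))
        return (self.k - 1) / self.ranks[self.items[-1]]

# Usage with hypothetical ranks (not the r-values of the example above).
ranks = random_ranks("abcdefghijklmnopqrstuvwxyz", seed=7)
sa, sb = BottomK(3, ranks), BottomK(3, ranks)
for x in "alice":
    sa.update(x)
for x in "wonderland":
    sb.update(x)
print(sa.items, sb.items, sa.union(sb).size())
```

Since an evicted element always has a rank larger than every element still in the sketch, re-inserting it or changing the insertion order leaves the sketch unchanged, which is requirement (ii) above.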

Applying sketches to reverse temporal cones
In the case of reverse temporal cones, the sets whose sizes we want to approximate are the reverse temporal cones at different time instants within a specific time interval $I$. Indeed, for each edge scanned by the dynamic programming algorithm described in the previous section, either we use the UPDATE operation in order to add a node to the sketch of a reverse temporal cone, or we use the UNION operation in order to compute the union of the sketches of two reverse temporal cones. Once all edges whose appearing time is in $I$ have been read, we can use the SIZE operation in order to estimate, for each node $u$, the size of $R(u, I)$. By linearity of expectation, the approximation performed by ATNF is unbiased and has relative error bounded by $\epsilon$ with high probability, whenever $k \in \Theta(\log n / \epsilon^2)$.
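Putting the pieces together, the approximation can be sketched as the dynamic programming scan with sets replaced by bottom-k sketches. The code below is an illustrative sketch under assumptions and not the authors' Java implementation: the names `atnf`, `merge` and `estimate` are hypothetical, sketches are stored as sorted lists of rank values, ranks come from a seeded random permutation, edges are directed `(v, u, t)` triples sorted by non-decreasing `t`, and the size estimator is the standard $(k-1)/r_k$ one.

```python
import random
from itertools import groupby

def merge(k, *rank_lists):
    """Bottom-k of the union of several collections of ranks (UPDATE/UNION)."""
    return sorted(set().union(*rank_lists))[:k]

def estimate(k, sketch):
    """SIZE: exact below k stored ranks, (k - 1) / largest stored rank otherwise."""
    return float(len(sketch)) if len(sketch) < k else (k - 1) / sketch[-1]

def atnf(nodes, edges, t_alpha, t_omega, k, seed=0):
    """Approximate |N([t_alpha, t_omega])| as the sum of estimated cone sizes."""
    nodes = list(nodes)
    order = nodes[:]
    random.Random(seed).shuffle(order)
    rank = {u: (i + 1) / len(nodes) for i, u in enumerate(order)}
    sketch = {u: [] for u in nodes}  # sketch of R(u, .) for every node u
    window = [e for e in edges if t_alpha <= e[2] <= t_omega]
    for _, same_time in groupby(window, key=lambda e: e[2]):
        pending = {}
        for v, u, _t in same_time:
            # v and everything that reached v before t can now reach u.
            pending.setdefault(u, []).append([rank[v]] + sketch[v])
        for u, parts in pending.items():
            sketch[u] = merge(k, sketch[u], *parts)
    return sum(estimate(k, s) for s in sketch.values())

# Hypothetical toy stream; with k larger than the number of nodes
# no sketch ever fills up, so the result is exact.
toy = [(1, 2, 1), (2, 3, 1), (2, 3, 2), (3, 4, 2), (4, 1, 5)]
print(atnf([1, 2, 3, 4], toy, 1, 5, k=8))  # 8.0
```

Each scanned edge costs time proportional to k rather than to the cone size, which is where the O(km) bound comes from. For bidirectional edges, each edge would have to be considered in both directions, as in the exact algorithm.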

Experimental results
This section is devoted to analyzing the performance of ATNF in approximating $N(I)$. We evaluate both the running time of ATNF and the quality of the approximation achieved, comparing our results with the ones provided by ETNF. Summarizing, the two algorithms we are going to compare are the following.
ATNF: Our method for approximating the earliest arrival reverse temporal cones and hence $N(I)$, as described in Section 3.1. As the quality of the approximation of ATNF and its running time depend on the value of $k$, we analyze the behaviour of ATNF varying $k$ in the set {2, 4, 8, 16, 32, 64, 128}.
ETNF: The method in [9], described in Section 1, to compute exactly the earliest arrival temporal cones and hence $N(I)$ for any $I$.

Implementation and Computing Platform
Our computing platform is a machine with Intel(R) Xeon(R) CPU E5-2620 v3 at 2.40GHz, 24 virtual cores, 128 GB RAM, running Ubuntu Linux version 4.4.0-22-generic. The code of both algorithms has been written in Java, using Java 1.8.

Dataset
In order to perform our experiments, we used the following graphs (other temporal networks will be considered in Section 5, in which we will perform a case study based on the directed public transport networks of 25 cities).
• COLLEGE: Private messages sent on an online social network at the University of California, Irvine. An edge (u, v, t) means that user u sent a private message to user v at time t [26,27].
• ENRON: Emails between Enron employees between 1999 and 2003. An edge (u, v, t) means that user u sent an email to user v at time t.
• FACEBOOK-WALLPOST: A small subset of posts to other users' walls on Facebook. The nodes of the network are Facebook users, and each directed temporal edge represents one post, linking the user writing a post to the user whose wall the post is written on [28][29][30].
• IMDB: Every node corresponds to an actor and two actors are connected by their collaboration in a movie, where the appearing time of an edge is the year of the movie. We will consider both the whole temporal collaboration graph and the graphs induced by the following genres: adventure, horror, thriller, crime, romance, action, comedy, drama (for each genre, we consider only the edges corresponding to movies classified in that genre) [31].
• ROLLERNET: Opportunistic sighting of Bluetooth devices by groups of rollerbladers carrying Intel iMotes during a roller tour. The 62 iMotes performed neighborhood scans every 15 seconds [32,33].
• TOPOLOGY: The nodes are autonomous systems and the edges are connections between autonomous systems. The appearing time of an edge is the time-point of the corresponding connection [29,30,34].
• TWITTER: Tweets about the migrant crisis of 2015 [35,36]. A directed edge (u, v, t) means that user u retweeted a tweet of user v at time t.
• WIKI-ELECTION: The network of users from the English Wikipedia that voted for and against each other in admin elections. Nodes represent individual users, and edges represent votes. Each edge is annotated with the date of the vote [29,30,37].
The dimensions of the above temporal graphs are summarized in Table 1, where we report for each graph its number of nodes and its number of edges, and whether the graph is directed or not. For a given network $G$, let $t_\alpha$ and $t_\omega$ denote the minimum and maximum appearing time, respectively, of the temporal edges included in $G$. We have then considered different intervals $I = [t_\alpha, t_\beta]$, for increasing values of $t_\beta$ with $t_\beta \in T(G)$, where $T(G)$ is the time horizon of $G$ (that is, the set of all the appearing times of its temporal edges). Both ETNF and ATNF compute (or approximate) the number of pairs of nodes $u$ and $v$ such that, starting from $u$ not before time $t_\alpha$, we can reach $v$ within time $t_\beta$.
As a result, we get an exact cumulative frequency distribution by running ETNF and its approximation by running ATNF. Section 4.1 is devoted to measuring the quality of this approximation, while Section 4.2 shows the running times.

Quality of the Approximation
In order to evaluate the quality of the approximation, for each $I = [t_\alpha, t_\beta]$ we have compared the approximation $|\hat{N}(I)|$ provided by ATNF with the exact value $|N(I)|$ provided by ETNF, using the relative error. We have hence considered the mean relative error (in short, MRE) among all the intervals $I = [t_\alpha, t_\beta]$ for all values of $t_\beta$ in $T(G)$. More formally, the MRE is defined as follows:
$$\mathrm{MRE} = \frac{1}{|T(G)|} \sum_{t_\beta \in T(G)} \frac{\bigl|\,|\hat{N}([t_\alpha, t_\beta])| - |N([t_\alpha, t_\beta])|\,\bigr|}{|N([t_\alpha, t_\beta])|}.$$
For each network and each value of $k$, we report the average MRE, denoted as µ, and the standard deviation, denoted as σ, over the ten experiments (note that both µ and σ are 0 for the ROLLERNET network, whenever k ≥ 64, since the graph has fewer than 64 nodes and hence ATNF turns out to always compute the exact values).
In general, both µ and σ consistently decrease as k increases. For k = 2, 4, and 8, the average MRE appears to be quite large: for k = 2, the MRE can be up to 50%, and for k = 8 the MRE can be close to 20%. By increasing k to 16 we get an average MRE consistently smaller than 17%, which further reduces with k = 32 and k = 64, where in both cases the average MRE is very often below 8%. Finally, for k = 128, the average MRE appears to be always smaller than 5.3%. For the sake of completeness, looking at the values of σ, we can observe that the variability of the experiments becomes smaller as k increases.
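The error measure can be computed as in the following sketch, which assumes that the exact and approximate values are stored in dictionaries indexed by the right endpoint of the interval and that every exact value is positive. The function name and the sample numbers are hypothetical, and the use of the sample standard deviation (`statistics.stdev`) is an assumption, since the text does not say which variant is reported.

```python
from statistics import mean, stdev

def mean_relative_error(exact, approx):
    """Average, over all right endpoints t, of |approx - exact| / exact."""
    return mean(abs(approx[t] - exact[t]) / exact[t] for t in exact)

# Made-up numbers, only to show the calling convention.
exact = {1: 10, 2: 20, 3: 40}
runs = [
    {1: 11, 2: 18, 3: 40},  # one approximate run
    {1: 9, 2: 22, 3: 44},   # another approximate run
]
per_run = [mean_relative_error(exact, run) for run in runs]
mu, sigma = mean(per_run), stdev(per_run)
print(mu, sigma)
```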
Note that, in the table, the IMDB-ALL and TWITTER networks are missing since, according to our estimate and as shown in the next section, ETNF would have taken more than one week for the first and more than 190 days for the second to complete.Hence, for these two graphs, no quality comparison was possible.

Running Time and Time Comparison with respect to ETNF
As the value of k increases, the running time of ATNF increases consistently (as does the quality of the approximation). In the case of bigger graphs, the running time of ATNF is orders of magnitude smaller than the one of ETNF (rightmost column). We have highlighted this improvement in Table 4, where we have shown the ratio between the running time of ATNF and the one of ETNF.
Concerning the smaller graphs, like ROLLERNET and ENRON, which have respectively 63 and 150 nodes, we have observed that ATNF turns out to be exact if k is set to a value greater than the number of nodes. In the case of ROLLERNET, it is interesting to note that the dynamic programming algorithm (which, hence, is exact for k = 64 or 128) is strictly faster than the BFS approach implemented by ETNF. On the other hand, as in the case of ENRON, the running time of ATNF can sometimes be worse than the one of ETNF whenever the graph has few nodes, even though ATNF is not exact.
In the general case, running ATNF with k = 128, and thus getting an average MRE below 5.3% (as we have seen in the previous section), the running time of ATNF is very often below 0.3% of the running time of ETNF. This improvement is even more striking for bigger graphs, where the running time of ATNF further reduces to 0.1% of the one of ETNF. For the two biggest graphs, i.e., IMDB-ALL and TWITTER, we were not able to run ETNF, due to the large amount of time required by the method. For this reason, we have estimated its running time (reported with † in Table 3). The estimation is based on the fact that ETNF runs $n$ single-source procedures, each one from a different source node. It is possible to estimate the average running time $\mu$ of a single-source procedure by sampling $\Theta(\log n / \epsilon^2)$ sources and computing the average time $\hat{\mu}$. It is easy to show that this is an unbiased estimator and that, by Hoeffding's inequality, $|\hat{\mu} - \mu|$ is bounded with high probability by $\epsilon \cdot r$, where $r$ is an upper bound on the maximum time needed by a single-source procedure, e.g., the time for visiting all the edges. By multiplying $\hat{\mu}$ by $n$, we get an estimation of the running time of ETNF. We have verified experimentally that the order of magnitude reported by our estimation is consistent with the actual time used by ETNF for the smaller graphs. As a  3. Comparing these running times with our estimates (last two rows of Table 4), ATNF turns out to be four orders of magnitude faster than ETNF.

Case Study: Comparison of 25 public transport networks
In this section, we will show how the ATNF algorithm for computing the approximation of the temporal neighborhood functions can be applied in order to perform an exhaustive analysis of some reachability properties of several temporal graphs in a very efficient way.In particular, we will make use of a recently published collection of 25 cities' public transport networks [14].This collection is available in multiple formats including the temporal edge list for a specific day, which is the format we use here.The list of the 25 cities is summarized in Table 5, where, for each city, we provide the number of stops (that is, the number of vertices of the temporal graph), the number of temporal edges, the day in which the data were collected, and the radius R around the city's central point that should cover all the continuous and dense parts of the city and its public transport network [14].We have grouped the 25 cities into five groups according to the vale of R. In particular, group i contains all the cities such that 10 Our goal is to analyze the reachability efficacy of each public transport network.To this aim, we compute the (approximate) value of the temporal neighborhood functions corresponding to different time intervals of the day in which the data were collected.In particular, we divide the interval between 6am and 9pm into 30, 15, and 10 intervals of length 30, 60, and 90 minutes, respectively.For each interval I = [t α , t ω ], we compute the approximate value of the temporal neighborhood function |N (I)| (normalized with respect to the number of all possible pairs of nodes), by using the ATNF algorithm.In the following, we will represent the distributions of these values by reachability diagrams such as the one shown in Figure 5, which refers to the city of Adelaide.
Observe that, in the case of transport networks, going from one stop to another, i.e. following an edge, takes some time. Therefore, in our dataset, the temporal edge list contains the starting time $s_e$ and the arrival time $a_e$ of each temporal edge $e$ between two nodes $u$ and $v$. In order to apply our approach, we have used the following transformation. Each edge $e$ from $u$ to $v$ with starting time $s_e$ and arrival time $a_e$ such that $a_e - s_e > 1$ is replaced by a pair of edges: one from $u$ to $z_e$ with appearing time $s_e$, and the other from $z_e$ to $v$ with appearing time $a_e - 1$, where $z_e$ is a new dummy node. Each edge $e$ from $u$ to $v$ such that $a_e - s_e = 1$ is replaced by a single edge from $u$ to $v$ with appearing time $s_e$. It is easy to show that there is a one-to-one correspondence between the feasible journeys in the original dataset and the sequences induced by the non-dummy nodes in the temporal paths of the resulting temporal graph. Moreover, it is worth remarking that, when computing the temporal neighborhood function in our transformed temporal graph, we have to focus only on non-dummy nodes. To this aim, we have slightly modified our approach² to exclude the dummy nodes from the sketches of our reverse temporal cones and, hence, not to count them in the final estimation of the temporal neighborhood function.
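The transformation above can be sketched as follows. This is only an illustration under stated assumptions: the input is assumed to be a list of `(u, v, s, a)` tuples with integer times, the function name `expand_timed_edges` is hypothetical, and dummy nodes are represented as `("dummy", index)` tuples so that they cannot clash with stop identifiers. Edges with $a_e - s_e < 1$ are not covered by the description and are rejected.

```python
def expand_timed_edges(edges):
    """Turn edges with start/arrival times into edges with a single appearing time.

    edges -- iterable of (u, v, s, a): edge from u to v, start time s, arrival time a
    Returns (temporal_edges, dummy_nodes), where temporal_edges is a list of
    (x, y, t) triples and dummy_nodes is the set of the introduced nodes z_e.
    """
    temporal_edges = []
    dummy_nodes = set()
    for index, (u, v, s, a) in enumerate(edges):
        if a - s == 1:
            # Unit traversal time: keep a single edge appearing at time s.
            temporal_edges.append((u, v, s))
        elif a - s > 1:
            # Longer traversal: route through a fresh dummy node z_e.
            z = ("dummy", index)
            dummy_nodes.add(z)
            temporal_edges.append((u, z, s))
            temporal_edges.append((z, v, a - 1))
        else:
            raise ValueError(f"edge {index} has traversal time below 1")
    return temporal_edges, dummy_nodes
```

The returned set of dummy nodes is what a modified sketch computation would need in order to leave those nodes out of the final count.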
In the left column of Figure 6, we show, for each city group, the reachability diagrams of all cities in the group in the case in which we consider intervals of 60 minutes (these diagrams have been obtained by running the approximation algorithm 20 times with $k = 64$, and by taking the average values). As can be seen, in the case of the first and the last group, the two cities included in the group behave quite similarly, even if the city of Sydney seems to be more efficient in correspondence with the two peaks of its diagram at 8am and at 5pm (which might be considered as the rush hours of the day). In the other three groups, instead, there is clearly a city that is more efficient than the others of the group (that is, Luxembourg in Group 2, Berlin in Group 3, and Adelaide in Group 4). In each of these groups, the existence of a second most efficient city (that is, Palermo, Winnipeg, and Paris, respectively) is also evident. While in Group 2 two cities (that is, Rome and Rennes) fight for the third position, in Group 3 all the other cities are basically equivalent.
Note that we are not evaluating the whole quality of a public transport service, which of course depends on many other factors, such as robustness, safety, reliability, and customer service; instead, we are only comparing different public transport networks in terms of their temporal neighborhood functions during a day. In particular, the reachability diagram of a public transport network should only be interpreted as an indicator of the percentage of the pairs of nodes which are connected in each interval, with respect to the total number of pairs of nodes. It is also worth emphasizing that, in order to use these diagrams for comparing the public transport networks of two cities, we must implicitly assume that the nodes of a network are uniformly spread in the circle of radius $R$ corresponding to the network itself. It clearly does not make sense to compare a transportation network in which nodes are concentrated in a small sector of the circle with one in which they are uniformly spread, since this would result in a greater efficiency of the first network with respect to the second one. Unfortunately, we are not able to verify that nodes are uniformly spread in the circle, and we can only assume that this requirement is satisfied.
² In the base case in Section 3, it is enough to exclude $u$ from $R(u, [t_\alpha, t_\alpha])$ whenever $u$ is a dummy node.
One could suspect that results similar to the ones shown in the left column of Figure 6 could be obtained by simply considering the temporal density of the original transport networks. For each interval $I = [t_\alpha, t_\omega]$, let $m(I)$ denote the number of edges from $u$ to $v$ with both starting and arrival time in $I$. We then define the temporal density in the interval $I$ as $\delta(I) = m(I)/n$, where $n$ denotes the number of nodes of the network. In the right column of Figure 6, we show, for each city group, the density diagrams of all cities included in the group. As can be seen, in Groups 1, 2, 3, and 5, the first city is the same as in the reachability diagrams. In Group 4, instead, Paris seems to have the densest public transport network, even if it is not the first city in terms of reachability cone sizes. Moreover, while the reachability and the density diagrams of Groups 1 and 5 seem to be quite consistent, this is not the case for Groups 2 and 3. For instance, in Group 2, Rome is denser than Palermo, but Palermo outperforms Rome in terms of reachability, while in Group 3 this phenomenon happens in the case of Prague and Winnipeg. In other words, the reachability diagram seems to provide some information that is not explicitly given by the density diagram. It is also worth observing that several cities present two peaks in their reachability and density diagrams, which we can assume correspond to the two rush hours of the day. This seems to indicate that these cities, in correspondence with these rush hours, increase the number of connections and provide a better service in terms of the number of pairs of connected nodes.
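As an illustration of the temporal density used in the density diagrams, here is a minimal sketch. It assumes that the original (untransformed) dataset is available as a list of `(u, v, s, a)` tuples, that an edge counts for an interval when both its starting and arrival times fall within the interval endpoints (inclusive), and that the density is $m(I)/n$; the function name `temporal_density` is hypothetical.

```python
def temporal_density(edges, n, t_alpha, t_omega):
    """Temporal density of the interval [t_alpha, t_omega].

    edges -- iterable of (u, v, s, a) with starting time s and arrival time a
    n     -- number of nodes (stops) of the network
    """
    # m(I): edges whose starting and arrival times both fall in the interval.
    m = sum(1 for (_u, _v, s, a) in edges if t_alpha <= s and a <= t_omega)
    return m / n


edges = [("a", "b", 0, 5), ("b", "c", 3, 12), ("a", "c", 10, 20)]
density = temporal_density(edges, n=3, t_alpha=0, t_omega=12)  # two of the three edges qualify
```

Unlike the reachability value, this quantity only counts edges and ignores whether they can be chained into temporal paths, which is consistent with the two kinds of diagrams ranking cities differently.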
In Figure 7, instead, we compare the reachability diagrams of the five groups obtained with time intervals of length 30 and 90 minutes, respectively (these diagrams should also be compared with the reachability diagrams shown in the left column of Figure 6, which are obtained with time intervals of length 60 minutes). It is worth observing that, in the second and in the third group, considering shorter time intervals makes the city with the best diagram perform even better compared to the others, and it basically nullifies the differences among the other cities. When longer time intervals are analyzed, it seems that the only effect we get is that the reachability diagrams are systematically shifted up. This is quite reasonable, since we can expect that, in a time interval of an hour and a half, a significant percentage of all possible pairs of nodes are connected. It is anyway worth noting that, in the case of the three cities in Group 4 and of the two cities in Group 5, even with time intervals of 90 minutes, [...]

Figure 1. An example of a temporal graph with 5 nodes and 5 temporal edges (left), and the corresponding "non-temporal" graph in which edge temporal labels have been removed (right).

Figure 2. An example of a temporal graph with 5 nodes and 12 temporal edges (labels on edges represent appearing times), and the corresponding link stream representation (labels on the left represent nodes, labels above represent time instants).

Figure 3. The earliest arrival path from node 1 to node 3 in the temporal graph of Figure 2 in the time interval [1, 12] (left) and in the time interval [2, 12] (right).

Figure 4. Two temporal reachability cones of node 1: the first one (left) is in the interval [1, 5] and includes all nodes, while the second one (right) is in the interval [9, 12] and includes all nodes but node 4.

Figure 5. The reachability diagram of the public network of the city of Adelaide, in which the interval between 6am and 9pm has been split into 15 intervals of one hour each.

Figure 6. The reachability diagrams (left) and the density diagrams (right) of the public transport networks.

Figure 7. The comparison of the reachability diagrams with 30 intervals of 30 minutes each (left), and with 10 intervals of 90 minutes each (right) of the 25 public transport networks.

Figure 8. The reachability diagrams of three sample cities with different values of $k$.

Table 1. Number of nodes and edges of each (undirected or directed) graph.

For each value of $k$ considered (including 4, 8, 16, 32, 64, and 128), we have repeated the experiments ten times and we have reported our results in Table 2. In this table, for each $k$, we have reported the average MRE.

Table 2. Comparing the quality of the approximation of ATNF with respect to ETNF for approximating $|N(I)|$. For each $k$, the average $\mu$ and the standard deviation $\sigma$ of the MRE over ten experiments are reported.

Table 3 reports the average running time (in milliseconds) of ATNF and of ETNF to get, respectively, the approximate and the exact value of $|N(I)|$.

Table 3. Running times (milliseconds) of ATNF for different values of $k$ compared with the running time of ETNF. The numbers marked with † are estimated due to the computational limits of ETNF.

Table 4. Ratio between the running time of ATNF and the one of ETNF for different values of $k$ (lower is better). The values for the graphs marked with † are estimated due to the computational limits of ETNF.

Table 5. The 25 cities included in the public transport network dataset [14]. The Stops column indicates the number of nodes, the Edges column the number of temporal edges, the Day column the day in which the data were collected, the R column the city's radius (in km), and the G column the group the city belongs to.
Applying our estimation to the two biggest graphs,¹ ETNF requires 622 185 seconds (more than a week) for IMDB-ALL and 25 771 840 seconds (more than 298 days) for TWITTER. On the other hand, setting $k = 128$, ATNF is able to approximate $|N(I)|$ in only 112 seconds for the former graph and in 71 minutes for the latter one, as shown in Table 3.
¹ The single-source procedure requires on average 1.18 seconds for IMDB-ALL and 7.33 seconds for TWITTER, with relative standard deviations of 17.49% and 10.87%, respectively.