The ε -Approximation of the Time-Dependent Shortest Path Problem Solution for All Departure Times

: In this paper, the shortest paths search for all departure times (proﬁle search) are discussed. This problem is called a time-dependent shortest path problem (TDSP) and is suitable for time-dependent travel-time analysis. Particularly, this paper deals with the ε -approximation of proﬁle search computation. The proposed algorithms are based on a label correcting modiﬁcation of Dijkstra’s algorithm (LCA). The main idea of the algorithm is to simplify the arrival function after every relaxation step so that the maximum relative error is maintained. When the maximum relative error is 0.001, the proposed solution saves more than 97% of breakpoints and 80% of time compared to the exact version of LCA. Furthermore, the runtime can be improved by other 15% to 40% using heuristic splitting of the original departure time interval to several subintervals. The algorithms we developed can be used as a precomputation step in other routing algorithms or for some travel time analysis.


Introduction
Computing the arrival function from a source node to all other nodes is important for a lot of transportation applications.More formally, given a directed graph G = (V, E), a source node s ∈ V, we want to know the travel time between the source node s and all other nodes for every departure time (in some literature called a travel time profile).This problem is generally called the time-dependent shortest path problem (TDSP).The main principle is that the arrival time t u at the node u is used as the argument of the arrival time function f corresponding with the edge that origins at u.
The common approach is to use a piecewise linear function as a realization of the arrival function.Let us have two consecutive edges (e.g, the edges (s, u) and (u, d) in Figure 1a); then, the arrival time at the node d is the value of the arrival function f ud in the arrival time f su (t d ) at the node u, where t d is the departure time at the node s.In the exact case, the combination f 2 ( f 1 (t)) of two piecewise linear functions f 1 , f 2 with | f 1 |, | f 2 | linear pieces is also a piecewise linear function with up to | f 1 | + | f 2 | linear pieces [1].It means that the arrival function at the end of the path with n edges can have up to ∑ n i=1 | f i | linear pieces.For example, the path across the Pilsen city has around 100 edges.If every arrival function on the path has 24 linear pieces, the resulting arrival function has 2400 liner pieces.It can be seen that the computational time and memory requirements strongly increase with the length of the paths.
The problem with an increase in the number of linear pieces can be solved using the ε-approximation of the resulting arrival function.This approach reduces the number of linear pieces, and thus reduces memory requirements as well as computation time.Our proposed approach is based on the so-called label correcting modification of Dijkstra's algorithm [2].The main idea is to perform a simplification of the arrival functions during the computation with a suitable maximum absolute error so that the relative error ε is maintained.Further, this technique can be accelerated by the heuristic.The idea is that the original departure time interval is split into subintervals, and thus the number of edge relaxations is reduced.These subintervals are independent too, so the parallelization or distribution of computation is possible and effective.

Road Network
Let G = (V, E) be a directed graph that represents a road network, where V is a set of nodes and E is a set of edges.Each edge (u, v) ∈ E has an arrival function (AF) f : R → R ≥0 that for the given departure time at u returns the arrival time at v. Alternatively, we can define a travel time function (TTF) that returns the time needed to cross the edge.A relationship between the TTF g and the corresponding AF f is defined as It is assumed that every AF f fulfills the first-in-first-out (FIFO) property: ∀t 1 < t 2 : f (t 1 ) ≤ f (t 2 ) and the departure time t d must be smaller then the arrival time t a (the travel time must be positive).AFs are implemented as piecewise linear functions.
In Figure 1a you can see AFs ( f su , f sv , f ud and f vd ) for every edge in a small example graph with four nodes V = {s, u, v, d} and four edges E = {(s, u), (s, v), (u, d), (v, d)}.
The points of AFs are called breakpoints.The number of breakpoints of AF f can be written as | f |.The following operation must be defined for two AFs:

•
There are two consecutive edges (s, u) and (u, d) with AFs f su , f ud .The operation combination f ud * f su : t → f ud ( f su (t)) represents AF from s to d.In Figure 1b there are AFs as results of the combination along the paths (s, u, d) (solid red line) and (s, v, d) (solid blue line).

•
There are two parallel paths p 1 , p 2 from s to d with AFs f 1 sd , f 2 sd .The operation minimum min( f 1 sd , f 2 sd ) : t → min{ f 1 sd , f 2 sd } represents the earliest AF from s to d.In Figure 1a p 1 = (s, u, d) and p 2 = (s, v, d).In Figure 1c you can see this earliest AF as a result of the operation minimum (green line).

Problem Definition
More precisely, TDSP can be defined as minimizing the travel time over the set P s,d of all paths in G from the source node s to the destination node d: where f d is the function of the earliest arrival time (minimal AF) from s to d and f p is AF of the path p ∈ P s,d .This paper deals with one-to-all problem.The input data are the graph G, AF f uv for every edge (u, v) ∈ G and the source node s.The output is the set F of the earliest AFs from the source node s to all other nodes u: F = { f u |u ∈ V \ {s}}.

Approximation
In this paper the ε-approximation of AF f is understood as the ε-approximation of TTF f (t) − Let us present some useful theorems about the approximation that were derived for a use in the proposed algorithm.Theorem 1.Let g be an ε-approximation of TTF g.Then it holds that Proof.If the function g is substituted by its extreme values (1 − ε)g, (1 + ε)g, the expression is still valid.
Theorem 2. Let f u be an ε-approximation of AF f u and ] be the maximum slope of AF f uv , approx( f , δ) be a function that simplifies the AF f with the maximum absolute error δ ≥ 0. Then Proof.The maximum absolute error of f v is εg v and the maximum absolute error of the operation f uv * f u is αεg u .Then, the result of the combination can be simplified with the maximum absolute error: Theorem 2 can be also formulated in a local form for a given departure time.
In Figure 1b there are dotted lines that represent the approximation of AFs.In Figure 1d

Related Work
There are two groups of methods that compute an approximation of the AF.The first methods use forward and backward probes.The forward probe computes the arrival time at the node d with the given departure time at the node s.The backward probe solves the inverse problem.The arrival time at d is given and we want to know the departure time at s.These probes can be computed using the well-known Dijkstra's algorithm [3].
These methods recognize two types of breakpoints.The V points represent points that are created as images of the breakpoints that lie on the edge arrival functions { f uv |(u, v) ∈ E}.The X points are created as an intersection of two AF in the minimum operation.It can be proven that the AF between two consecutive V points is concave or a line segment [1] (see example in Figure 1c).The algorithms described in [1,4] use this concavity.First, the V points are computed using one backward probe and two forward probes (more in [3]), and then the approximation of AF between the V points is determined.The main problem of this approach is that the computation of V points requires 3 The second group of methods uses a label correcting modification of the Dijkstra's algorithm (LCA) (Algorithm 1).The modifications of the Dijkstra's algorithm are:

•
The node labels are AFs from s.

•
The key of the priority queue is the minimum of AF (min f ).

•
The relaxation of the edge (u, v) is performed using LCA has time complexity O(|V||E|) [2].However, a real road network is far from the worst case, because it is close to planar graph and the cost of edges is the travel time; i.e., is bounded by physics rules, and thus the shortest path is near to the direct line.This technique is widely used; see e.g, [5][6][7].First LCA is performed in the exact form and after that the resulting arrival functions F are simplified and used for further computation (e.g, some query algorithm).Some guarantees about the error of AF are presented in [5], but these guarantees give only a maximum error dependent on the degree of approximation.
Algorithm 1: LCA in the exact form.
1 PQ = minimum priority queue where key is min f In Algorithm 1 the initialization is performed in the lines 1-4.All node labels (AFs) are set to infinity in the line 2. The travel time at s is set to zero (line 3) and the node label at s ( f s ) is added to the priority queue (PQ) (line 4).In the line 6 the algorithm takes the node on the top of PQ and relaxes all edges that lead from this node.The relaxation is represented by the lines 8-11.The line 8 performs the combination of the node label at u ( f u ) and the edge AF ( f uv ).The condition in the line 9 checks for updates.Updates of the label at the node v are performed in the line 10.Line 11 puts the node v to the PQ.
There are a lot of other variants of LCA.Those variants differ in the type of data structure-a queue (FIFO, first-in-first-out) [8], a priority queue (as in our case) [5,7,9], a stack (LIFO, last-in-first-out) [10].The algorithms also differ in the insertion strategy into the data structure.Some innovative insertion strategies are presented in [11,12].The choice of LCA variant depends on the type of the target graph.
The main task is to develop an algorithm which solves TDSP with the given maximum relative error and is effective for a real road network.It follows that we focused on the ε-approximation of the LCA.

Proposed Algorithms
This section describes two algorithms solving the ε-approximation of TDSP based on LCA (Algorithm 1).

ε-LCA Algorithm
The basic idea of the first proposed algorithm (ε-LCA) is that the simplification of AF is performed after every edge relaxation (the operation combination).The degree of the simplification is directed by Theorem 2.
The ε-LCA computes AF with the maximum relative error ε assuming that where α uv is the maximum slope of AF f uv .The slope α uv must be bounded because Theorem 2 is used in the ε-LCA and the theorem needs this assumption.The ε-LCA differs from the exact LCA only in the computation of f v .The AF f v in the line 8 in Algorithm 1 is simplified with the maximum absolute error δ (according to Theorem 2), so the line 8 is replaced by two lines where α(t) is the maximum slope of f uv in the interval The simplification was performed using Douglas-Peucker algorithm (DP) or Imai and Iri algorithm (II) [13].
The main problem of ε-LCA is that if the assumption (2) is not complied, δ < 0 and the algorithm cannot ensure the given relative error ε.This occurs when the maximum slope α uv is too large.This issue is resolved using the second algorithm that is described in the upcoming section.

ε-LCA-BS Algorithm
The second proposed algorithm (ε-LCA-BS) is based on backsearch.It has no limitations for the slope α.The pseudo-code of the ε-LCA-BS is in Algorithm 2. The basic idea is that if the algorithm finds an edge (u, v) where α is too big in some departure time interval [t m , t n ] (δ < 0), it determines f v in [t m , t n ] again with a higher accuracy (lines 11-15 of the Algorithm 2).The algorithm returns back to the point such that the edge (u, v) can be reached from this point with sufficient precision using the exact LCA.
When the label at the node v (AF f v ) is updated (the condition at the line 16 is fulfilled), the edge (u, v) is added to the predecessor list pred(v) of the node v (lines 17-20).The set of predecessors form a graph R = (V R , E R ) (red color in Figure 2).We assume that the graph R is acyclic.In general, the graph R may not be acyclic, but in real case it is very unlikely.We want to find nodes w such that if the exact LCA is performed from these nodes w, the AF f v is an ε-approximation.The nodes w have to satisfy the following inequalities in the interval [t m , t n ]: where P vw is the set of all paths from v to w in the graph R (the paths must contain the edge (u, v)) and α e represents the maximum slope of AF f e corresponding with the edge e.The set W is the set of all nodes w that meet the condition (4) and there is a path p ∈ P vw that does not contain any other node from the set W (W is the smallest possible).
1 PQ = minimum priority queue where key is min f // according Equation ( 3) These nodes w can be found using a topological ordering of V R (Algorithm 3).The node labels α b correspond to the left side of the inequalities (4), so the algorithm finds maximal paths in R. First, the labels are set to negative infinity (the line 2) and the label at u is set to α uv (the line 3).The lines 4-5 ensure a topological ordering.If the condition (4) in line 6 is fulfilled, b is added to the W. The lines 9-12 ensure updating of the node labels.The part of f v in the interval [t m , t n ] is substituted by a more accurate result of the exact LCA with the initial priority queue PQ that is created by adding all f w ∈ { f w |w ∈ W} (the lines 13-16).
In Figure 2 there is an example of backsearch.The black color represents the original graph G and the red color represents the acyclic graph R. The edge (u, v) violates the condition (2).Then the algorithm starts backSearch procedure and finds the set W (red nodes) using the graph R.
As you can see, for the intervals [a, b] and [c, d] the condition at line 16 in Algorithm 2 is false.Thus, the node will not be added to the PQ and K is reduced.

time of day
The question is how to split the departure time interval so that the computation is the most effective.In Figure 4 there is a visualization of speed profiles for the testing dataset.The x-axis represents a time of day and the y-axis displays the average speed in time on edges.In the visualization, all speed profiles from the tested dataset are rendered in various colors.As you can see, the fluctuation in the night time (the first part and the last part) is small.There are two main concepts for splitting:

•
Split the origin interval into equal subintervals.

•
Split the origin interval into inhomogeneous subintervals (e.g., longer at night and shorter by day).
Which of these concepts is better strongly depends on the TTFs shape, as further presented in experiments.
The splitting into intervals enables a very simple parallelization.Due to the independence in the departure time intervals, we can set one interval equal to one thread.

Experiments
The real road network with real speed profiles that were computed from GPS tracks was used for testing.These data represent part of Paris (Figure 5).Compared to [9], the speed profiles' preprocessing was improved so the measured results are a little bit different.The algorithms were implemented using Scala programming language (OpenJDK 1.9, Debian 11).The testing was performed on a computer with Intel (R) Core (TM) i5-8250U CPU with 1.60GHz and 16 GB RAM.

The -LCA-BS Testing
Only the -LCA-BS was tested because -LCA is only a special case of -LCA-BS.The maximum allowed relative error ε was set to 10 −2 , 10 −3 , 10 −4 and 10 −5 .
Four graphs (G1, G2, G3 and G4) were created to show the performance of the developed algorithms.Every edge in the graphs has AF with 24 linear pieces.Every graph represents different classes of roads.In Table 1 there are numbers of edges for each graph and performance results of -LCA-BS: the relative time t r and the relative number of breakpoints bps related to the exact version of LCA.The same results can be seen in Figure 6.The results in Table 1 show that the maximum relative error 10 −4 brings only a small improvement and the maximum relative error 10 −5 is slower than the exact version of the LCA, but this accuracy is too big for a real use.The maximum relative error from 10 −2 to 10 −3 seems to be a good compromise between accuracy and performance.
In Table 2 there are absolute values of parameters measured for the maximal relative error 10 −3 .The column bps represents the number of breakpoints in the resulting AFs.The results show that the breakpoints savings are significant.That means the ε-LCA-BS saves a lot of memory.Let us assume a path that takes 1 h; then, the relative error 0.1 % implicates the absolute error 3.6 s.In this case the epsilon approximation saves more than 97% of memory and 80% of time.In case that ε is too small, the algorithm can run slower than the exact version, because the simplification takes too much time.
The disadvantage of the ε-LCA-BS is that it is sensitive to values of the maximum slope of AFs.If AFs have too big slope then the algorithm performs too many calls of the backSearch procedure and thereby makes the computation too slow.In practice, the functions usually have small slopes.When it is certain that input data do not violate the condition (2), the algorithm is more suitable.
Furthermore, the computation of V breakpoints was implemented.This computation is the first step of the all algorithms presented in [1,3,4,16].As mentioned above, the step needs 3 ∑ (u,v)∈E | f uv | static shortest path computation.The V breakpoints determination was performed on G1 and takes 21 min.

Splitting Tests
The first test is focused on equal subintervals.The original departure time interval was divided into 2-12 subintervals.The two datasets (G1 and G2) were used and the maximum relative error was set to 10 −2 , 10 −3 , and 10 −4 .The more effective simplification method by Imai and Iri was set for all tests in this section.The test was performed only in serial (one thread was used only).In Figure 7 there are results of the test.The chart shows the dependence between the speed-up and the number of subintervals.The speed-up is defined as: speed-up(N) = runtime for 1 interval runtime for N subintervals .
As you can see, the speed-up is between 1.1 and 1.8 and is very variable, but never under 1.For our testing data, 4-10 equal subintervals are a good choice.In case when the number of subintervals is too big (in our case more than 10), the overhead costs override the benefits of splitting.The inhomogeneous subintervals were also tested.Several methods for the departure time interval splitting were tried, but the runtime was only a few percent better than the splitting into equal subintervals.The conclusion is that the equal subintervals, although also more simple, work better.
The last test demonstrates the suitability of parallelization.The dataset G1 with ε = 0.001 was used.The algorithm was performed with three settings: without splitting into subintervals, with four subintervals in the serial case (1 thread) and with four subintervals in four threads.The code was performed on one computer with multi-core processor.The following runtimes were measured (Table 3).The testing shows that the parallelization is suitable so it is recommended.

Conclusions
Two algorithms for ε-approximation of TDSP and their heuristic improvement for real data were presented.The algorithms significantly reduce the memory use.When the maximum relative error is a sufficiently large value (in our case 10 −3 ), the algorithms save the computational time, too.From this point of view, the algorithms are suitable for precomputing the TTFs for the next use (e.g., time-dependent distance oracles and time-dependent contraction hierarchies).
It has been shown that the splitting into departure time subintervals can further reduce the runtime and this splitting is suitable for parallelization or distribution because the subproblems are independent.
In future work, it would be useful to utilize some technique for graph size reduction based on static shortest path search.The goal is to remove the edges that will certainly not be used.In a one-to-one problem case the developed algorithms can be combined with other speed-up techniques that reduce the graph (e.g., time-dependent-sampling [14]).
you can see the ε-approximation f d of the earliest AF f d from s to d (solid green line) and its upper bound f ↑ d and lower bound f ↓ d (green dashed lines).

Figure 1 .
Figure 1.Example of calculation of the arrival function from node s to d (t d -departure time, t a -arrival time).

Figure 4 .
Figure 4. Speed profiles visualization.The colors represent the individual speed profiles.

Figure 6 .
Figure 6.The relative number of breakpoints and the relative time related to exact LCA (II-Imai and Iri, DP-Douglas-Peucker).

Figure 7 .
Figure 7. Speed-up dependence on the number of subintervals N in serial case (1 thread).

Table 1 .
Results of testing -LCA-BS (t r -the relative time related to the exact version of LCA, bps-the relative number of breakpoints, ε-the maximum allowed relative error).