A Unified Approach to Spatial Proximity Query Processing in Dynamic Spatial Networks

Nearest neighbor (NN) and range (RN) queries are basic query types in spatial databases. In this study, we refer to collections of NN and RN queries as spatial proximity (SP) queries. At peak times, location-based services (LBS) need to quickly process SP queries that arrive simultaneously. Timely processing can be achieved by increasing the number of LBS servers; however, this also increases service costs. Existing solutions evaluate SP queries sequentially; thus, such solutions involve unnecessary distance calculations. This study proposes a unified batch algorithm (UBA) that can effectively process SP queries in dynamic spatial networks. With the proposed UBA, the distance between two points is indicated by the travel time on the shortest path connecting them. The shortest travel time changes frequently depending on traffic conditions. The goal of the proposed UBA is to avoid unnecessary distance calculations for nearby SP queries. Thus, the UBA clusters nearby SP queries and exploits shared distance calculations for query clusters. Extensive evaluations using real-world roadmaps demonstrated the superiority and scalability of UBA compared with state-of-the-art sequential solutions.


Introduction
This study investigates a unified batch approach to spatial proximity (SP) queries in dynamic spatial networks. In the investigated approach, the distance between two points is the travel time of the shortest path connecting them, and the shortest travel time frequently changes depending on traffic conditions, such as traffic volume and accidents. In this study, SP queries refer to a collection of nearest neighbor (NN) and range (RN) queries, which are basic query types in spatial databases. NN queries retrieve points of interest (POI), such as taxis and restaurants, closest to a query user [1,2], and RN queries retrieve POIs within a query distance [3-5]. Typically, location-based services (LBS), such as taxi-booking and ride-sharing services, use real-time spatial data to locate POIs close to the query user [6-10]. When multiple SP queries reach an LBS server simultaneously at peak times, if the SP queries are processed sequentially, it may not be possible to provide prompt responses to the query users. This difficulty can be addressed by increasing the number of LBS servers or by developing state-of-the-art algorithms based on "one-query-at-a-time processing" [3,11-15] to process the SP queries quickly.
SP queries have many potential applications in dynamic spatial networks, such as ride-hailing and car parking facilities. For example, in 2020, the ride-hailing company Uber accomplished an average of 18.7 million trips per day [16], demonstrating the significance of scalable and efficient solutions to promptly match Uber cabs with passengers. Another example is real-time parking management, which helps drivers find a parking space close to their destination. Figure 1 shows two snapshots of SP queries in a dynamic spatial network, where the set of SP query points is $Q = \{q_1^{NN}, q_2^{RN}, q_3^{NN}\}$ and $P$ denotes the set of data points.

Figure 2 is a system diagram of the proposed UBA between the query points and the LBS server. Query points send their locations and query requests to the LBS server (step 1). The LBS server collects the requests from the query points and forwards them to the UBA (step 2). The UBA first groups nearby query points into query clusters for shared computation (step 3). Then, UBA retrieves candidate data points for each query cluster to avoid unnecessary network traversals (step 4). UBA evaluates each query using the candidate data points for the query cluster (step 5) and returns the query results to the LBS server (step 6). Finally, the LBS server provides the result to each query point (step 7).
All nearest neighbor (ANN) queries [17,18] are similar to SP queries. However, ANN queries assume that each query point q in Q only finds a single data point closest to q and, therefore, do not consider RN queries. This study considers a highly dynamic situation in which both query and data points move freely within a dynamic spatial network [19-21]. The proposed UBA can effectively process SP queries in dynamic spatial networks. For simplicity, this study considers NN queries rather than kNN queries, which retrieve the k data points closest to the query user for a positive integer k. However, UBA can easily be extended to process kNN queries.
The primary contributions of this study are summarized as follows.
• A unified batch processing algorithm, i.e., UBA, is proposed for the batch processing of SP queries in dynamic spatial networks. The performance of UBA highly depends on the distribution of query points. Thus, UBA clearly outperforms sequential algorithms when query points display a skewed distribution. Conversely, UBA shows similar performance to sequential algorithms when query points display a uniform distribution.
• Clustering of SP queries and their shared computation are presented to avoid unnecessary distance computations. The correctness of UBA is proved using a lemma. Furthermore, a theoretical analysis is presented to establish the advantage of UBA over sequential algorithms, particularly when query points display a skewed distribution.
• An empirical study is conducted under various conditions to demonstrate the superiority and scalability of UBA compared with a sequential algorithm.
The remainder of this paper is organized as follows. Section 2 reviews related studies. Section 3 introduces the necessary preliminaries, including a definition of the notations and symbols used in this study. Section 4 explains how to cluster nearby SP queries into query clusters and presents the proposed UBA for SP queries in dynamic spatial networks. Section 5 presents an empirical study of UBA compared to a conventional algorithm under various conditions. Conclusions and suggestions for future work are presented in Section 6.

Figure 2. System diagram of the proposed UBA between the query points and the LBS server (steps 1-7).

Related Work
Researchers have developed algorithms and index structures to evaluate spatial queries, including NN and RN queries for LBSs [6,22-25]. When calculating the length of the shortest path between two points, spatial queries for dynamic spatial networks suffer from high computational cost because graph traversal is required at runtime. Therefore, numerous studies attempt to reduce the computational cost of the shortest path distance and to avoid unnecessary shortest-path computations [6,22-25]. Incremental Euclidean restriction (IER) and incremental network expansion (INE) were developed for NN queries [3]. IER assumes that the length of the shortest path between two points is greater than or equal to their Euclidean distance. INE explores the spatial network incrementally from the query point, as in Dijkstra's algorithm, and investigates the data points in the order in which they are encountered. Range network expansion (RNE), which is similar to INE, was also developed for RN queries [3]. The route overlay and association directory method, ROAD [12], hierarchically divides the spatial network and pre-calculates the length of the shortest path between the border vertices within each partition. The distance-browsing method, DisBrw [13], exploits the spatially induced linkage cognizance index and retains the length of the shortest path between each pair of vertices. G-tree [15] hierarchically divides the spatial network and uses an assembly-based approach to compute the length of the shortest path between two vertices. V-tree [14] iteratively divides the spatial network into subnetworks and identifies the border vertices of each subnetwork. Then, the V-tree maintains a list of data points closest to each border vertex to quickly evaluate kNN queries. A scalable and in-memory kNN query processing method called SIMkNN [26] was developed to quickly evaluate snapshot kNN queries over moving objects in a spatial network.
The existing methods described in [6,22-25] are considered to be one-query-at-a-time processing algorithms because they aim to quickly evaluate a single spatial query rather than a batch of spatial queries. This study is motivated by the observation that, with simple modifications, NN query processing algorithms can be applied to evaluate RN queries for spatial networks.
Multi-query optimization techniques were originally studied for relational database systems [27]. Their goal is to reduce the computational costs for a collection of queries that concurrently reach the database server by evaluating shared subexpressions once, materializing them temporarily, and then reusing them to evaluate other queries. These multi-query optimization techniques were later expanded to involve query rewriting, query result caches, materialized views, and intermediate query results for relational database systems [28-36] and stream processing systems [37-39]. Many applications involving high-load conditions have shown that batch processing algorithms can significantly reduce the query processing time for multiple simultaneous queries [19,30-43]. Furthermore, multi-query optimization techniques have received significant attention in spatial databases. Several batch shortest path algorithms [19,40-44] have been developed to efficiently evaluate multiple shortest path queries in spatial networks. However, these batch shortest path algorithms cannot be directly used to evaluate SP queries because their problem definitions differ. Several cache strategies for query results have been developed to efficiently process batches of kNN queries in spatial networks [6]. These strategies exploit the cached results of adjacent, recently computed queries to efficiently process a batch of kNN queries. However, cache strategies have clear limitations in dynamic spatial networks, as their results may be invalidated by frequent updates to the weights of the spatial segments and by the movement of query points or data points.
Finally, Li et al. [45-49] developed a series of algorithms for processing large complex networks, such as social networks. Specifically, they considered a trust management system based on game theory [47], dynamic clustering for electronic commerce systems [45], identifiability for community detection [49], an optimal estimation of low-rank factors [48], and the identification of overlapping communities [46]. This work differs from existing studies in several respects. First, UBA considers SP queries in dynamic spatial networks. Second, UBA avoids unnecessary network traversals by clustering SP queries and performing batch processing. Third, UBA can easily be incorporated into one-query-at-a-time processing algorithms for spatial networks [3,12,13,15].

Preliminaries
This section defines the terms and notations that are used in this paper.
Definition 1 (NN query [1,11,18,22,25]). Given a query point $q^{NN}$ and a set of data points $P$, an NN query retrieves the data point $p^{NN}$ closest to $q^{NN}$ such that $dist(q^{NN}, p^{NN}) \le dist(q^{NN}, p)$ holds for $\forall p^{NN} \in \Pi(q^{NN})$ and $\forall p \in P - \Pi(q^{NN})$.

Definition 2 (RN query [3-5]). Given a positive integer $r$, a query point $q^{RN}$, and a set of data points $P$, an RN query retrieves the data points within query distance $r$ of $q^{RN}$ such that $dist(q^{RN}, p^{RN}) \le r$ holds for $\forall p^{RN} \in \Pi(q^{RN})$.

Definition 3 (Spatial network [3,9,11,25,26,41,50,51]). A dynamic spatial network can be described as a dynamic weighted graph $G = \langle V, E, W \rangle$, where $V$, $E$, and $W$ indicate the vertex set, edge set, and edge distance matrix, respectively. An edge has a nonnegative weight, e.g., travel time, and changes its weight frequently.

Definition 4 (Intersection, intermediate, and terminal vertices).
In this study, vertices are categorized via their degree as follows: (1) if the degree of a vertex is greater than or equal to three, the vertex is an intersection vertex; (2) if the degree is two, the vertex is an intermediate vertex; (3) if the degree is one, the vertex is a terminal vertex. For example, $v_2$ and $v_3$ in Figure 3 are intersection vertices, $v_5$ and $v_6$ are intermediate vertices, and $v_1$ and $v_4$ are terminal vertices.
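The degree-based categorization in Definition 4 can be illustrated with a minimal Python sketch. The edge list below is an assumption: it is inferred from the four vertex sequences that the text lists for Figure 3, and the function name `classify_vertices` is hypothetical rather than part of UBA.

```python
from collections import defaultdict


def classify_vertices(edges):
    """Label each vertex by its degree, following Definition 4."""
    degree = defaultdict(int)
    for u, v in edges:
        degree[u] += 1
        degree[v] += 1
    kinds = {}
    for vertex, deg in degree.items():
        if deg >= 3:
            kinds[vertex] = "intersection"
        elif deg == 2:
            kinds[vertex] = "intermediate"
        else:
            kinds[vertex] = "terminal"
    return kinds


# Assumed edge list for Figure 3, derived from the vertex sequences
# v1v2, v2v3, v3v4, and v2v5v6v3 named in the text.
FIGURE3_EDGES = [
    ("v1", "v2"), ("v2", "v3"), ("v3", "v4"),
    ("v2", "v5"), ("v5", "v6"), ("v6", "v3"),
]

if __name__ == "__main__":
    for vertex, kind in sorted(classify_vertices(FIGURE3_EDGES).items()):
        print(vertex, kind)
```

Under this assumed edge list, the labels agree with the example in Definition 4.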

Definition 5 (Vertex sequence and segment).
A vertex sequence $v_l v_{l+1} \ldots v_m$ denotes a segment connecting two vertices $v_l$ and $v_m$ such that $v_l$ and $v_m$ are either an intersection vertex or a terminal vertex, and the other vertices in the segment, i.e., $v_{l+1}, \ldots, v_{m-1}$, are intermediate vertices.
The length of a vertex sequence is the total weight of the edges in the vertex sequence. Parts of a vertex sequence are referred to as segments. By definition, a vertex sequence is also a segment. For example, Figure 3 has four vertex sequences: $v_1 v_2$, $v_2 v_3$, $v_3 v_4$, and $v_2 v_5 v_6 v_3$. Examples of segments in Figure 3 include $v_2 v_5 v_6$, $v_5 v_6$, and $v_3 v_6 v_5$. Table 1 summarizes the symbols and notations used in this study. Note that the query points are often used interchangeably to refer to SP queries.

Figure 3 illustrates the difference between the distance and the segment length between $q_1$ and $q_2$ in a spatial network. Here, the shortest path from $q_1$ to $q_2$ is $q_1 \to v_2 \to v_3 \to q_2$, whose distance $dist(q_1, q_2)$ is equal to eight. The segment connecting $q_1$ and $q_2$ (marked with a bold line) is $q_1 v_5 v_6 q_2$, and its length $len(q_1 v_5 v_6 q_2)$ is equal to 10.

Table 1. Symbols and notations.
Symbol                        Definition
$P$                           Set of data points
$Q$                           Set of query points
$Q_C$ and $\overline{Q}$      Query cluster and a set of query clusters, respectively
$B(Q_C)$                      Set of border points for $Q_C$
$\Pi(q^{NN})$                 Set of data points closest to query point $q^{NN}$
$\Pi(q^{RN})$                 Set of data points within query distance $r$ from a query point $q^{RN}$
$P(qp)$                       Set of data points in a segment $qp$
$dist(q, p)$                  Length of the shortest path connecting points $q$ and $p$
$len(qp)$                     Length of the segment $qp$ connecting points $q$ and $p$
$v_l v_{l+1} \ldots$

Clustering Nearby SP Queries
Here, we consider five SP queries $q_1^{NN}$, $q_2^{RN}$, $q_3^{NN}$, $q_4^{RN}$, and $q_5^{NN}$ in a spatial network (Figure 4). Assume that the NN queries $q_1^{NN}$, $q_3^{NN}$, and $q_5^{NN}$ each find the data point closest to themselves and that the RN queries $q_2^{RN}$ and $q_4^{RN}$ find the data points within query distance $r = 4$ of themselves.

Figure 5 shows an example of the two-step clustering method, which groups nearby query points into a query cluster. In the first step, the query points in the same vertex sequence are connected into a query segment (Figure 5a). As a result, three query segments, $q_1^{NN} q_2^{RN}$, $q_3^{NN}$, and $q_4^{RN} q_5^{NN}$, are generated, where $q_1^{NN} q_2^{RN}$ connects $q_1^{NN}$ and $q_2^{RN}$ in vertex sequence $v_1 v_2$, and $q_4^{RN} q_5^{NN}$ connects $q_4^{RN}$ and $q_5^{NN}$ in vertex sequence $v_1 v_5$. In the second step, adjacent query segments are grouped into a query cluster using joint vertices (Figure 5b). An intersection vertex is referred to as a joint vertex when it is adjacent to more than two query segments. As shown in Figure 5b, the query segments are connected into a query cluster. In other words, the five query points $q_1^{NN}$, $q_2^{RN}$, $q_3^{NN}$, $q_4^{RN}$, and $q_5^{NN}$ are clustered into query cluster $Q_C$. Note that $Q_C$ is represented by a set of query segments. Consequently, the set $Q$ of query points is converted into a set $\overline{Q}$ of query clusters.

Next, we define the border point of query cluster $Q_C$. Any point at which $Q_C$ and its non-query cluster $G - Q_C$ meet is referred to as a border point of $Q_C$. In this example, $Q_C$ has three border points, i.e., $v_1$, $v_2$, and $v_5$. Note that sequential solutions must evaluate all five SP queries shown in Figure 4. The two-step clustering method enables UBA to evaluate three SP queries at the border points $v_1$, $v_2$, and $v_5$ rather than five SP queries at the query points $q_1^{NN}$, $q_2^{RN}$, $q_3^{NN}$, $q_4^{RN}$, and $q_5^{NN}$.
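The two-step clustering can be sketched in Python as follows. This is only an illustration under stated assumptions, not the authors' implementation: each query point is assumed to be tagged with the two end vertices of the vertex sequence on which it lies, "adjacent" is interpreted as "the vertex is an endpoint of that vertex sequence", and the vertex `v3` used for the location of `q3` is hypothetical because the text does not state it. The name `cluster_queries` is likewise hypothetical.

```python
from collections import defaultdict


def cluster_queries(query_locations):
    """Group query points into clusters with a simplified two-step method.

    query_locations maps a query id to the pair of end vertices of the
    vertex sequence that contains the query point (an assumed encoding).
    """
    # Step 1: query points on the same vertex sequence form a query segment.
    segments = defaultdict(list)
    for query, sequence in query_locations.items():
        segments[frozenset(sequence)].append(query)

    # Step 2: merge the query segments that meet at a joint vertex.
    incident = defaultdict(list)
    for sequence in segments:
        for vertex in sequence:
            incident[vertex].append(sequence)

    parent = {sequence: sequence for sequence in segments}

    def find(x):
        # Union-find lookup with path halving.
        while parent[x] != x:
            parent[x] = parent[parent[x]]
            x = parent[x]
        return x

    for vertex, sequences in incident.items():
        # Joint vertex as stated in the text: adjacent to more than two
        # query segments.
        if len(sequences) > 2:
            root = find(sequences[0])
            for other in sequences[1:]:
                parent[find(other)] = root

    clusters = defaultdict(list)
    for sequence, queries in segments.items():
        clusters[find(sequence)].extend(queries)
    return sorted(sorted(queries) for queries in clusters.values())


# Example loosely modeled on Figure 5; the sequence for q3 is hypothetical.
EXAMPLE = {
    "q1": ("v1", "v2"), "q2": ("v1", "v2"),
    "q3": ("v1", "v3"),
    "q4": ("v1", "v5"), "q5": ("v1", "v5"),
}

if __name__ == "__main__":
    print(cluster_queries(EXAMPLE))
```

In this sketch, `v1` touches three query segments and therefore acts as a joint vertex, so all five query points end up in one cluster. Query points on vertex sequences that share no joint vertex remain in separate clusters.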
Figure 6 illustrates the computation of the distance between a query point $q$ in query segment $q_i q_j$ and a data point $p$ for the following two cases: $p \notin q_i q_j$ and $p \in q_i q_j$. As shown in Figure 6a, when data point $p$ is outside query segment $q_i q_j$, i.e., $p \notin q_i q_j$, the distance from $q$ to $p$ is given as $dist(q, p) = \min\{len(q q_i) + dist(q_i, p),\ len(q q_j) + dist(q_j, p)\}$ because the shortest path between $q$ and $p$ is either $q \to q_i \to p$ or $q \to q_j \to p$. As shown in Figure 6b, when $p$ is inside $q_i q_j$, i.e., $p \in q_i q_j$, the distance is given as $dist(q, p) = \min\{len(qp),\ len(q q_i) + dist(q_i, p),\ len(q q_j) + dist(q_j, p)\}$ because the shortest path between $q$ and $p$ is one of the following three paths: $q \to p$, $q \to q_i \to p$, or $q \to q_j \to p$.

Figure 6. Computation of the distance between query point $q$ in query segment $q_i q_j$ and data point $p$: (a) $p \notin q_i q_j$; (b) $p \in q_i q_j$.
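The two distance formulas above can be written as one small Python function. This is a minimal sketch: the function name `dist_from_segment` and its parameters are hypothetical, and the numeric values in the usage example are made up for illustration only.

```python
def dist_from_segment(len_q_qi, len_q_qj, dist_qi_p, dist_qj_p, len_q_p=None):
    """Distance from a query point q in query segment q_i q_j to data point p.

    len_q_qi, len_q_qj: lengths of the segments from q to q_i and q to q_j.
    dist_qi_p, dist_qj_p: shortest-path distances from q_i and q_j to p.
    len_q_p: length of the segment from q to p when p lies inside q_i q_j;
             None when p is outside the query segment.
    """
    # Paths that leave the query segment through one of its two end points.
    candidates = [len_q_qi + dist_qi_p, len_q_qj + dist_qj_p]
    # When p is inside the segment, the direct segment q -> p is also possible.
    if len_q_p is not None:
        candidates.append(len_q_p)
    return min(candidates)


if __name__ == "__main__":
    # Hypothetical values: p outside the segment, then p inside it.
    print(dist_from_segment(2, 3, 5, 1))
    print(dist_from_segment(2, 3, 5, 1, len_q_p=3))
```

Keeping the end-point terms in the inside case matters on networks where a detour through $q_i$ or $q_j$ is shorter than the direct segment, as in the Figure 3 example where $dist(q_1, q_2) = 8$ but the connecting segment has length 10.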

Unified Batch Processing Algorithm for SP Queries
Algorithm 1 provides the key concept of UBA for the unified batch processing of SP queries in a spatial network. Here, the result set $\Pi(Q)$ is initially set to an empty set (line 1). Then, nearby query points are grouped into query clusters (lines 2 and 3), as discussed in Section 4.1. A cluster search is then executed for each query cluster $Q_C$ to perform batch processing of the SP queries in $Q_C$, and its query result is saved to $\Pi(Q_C)$ (line 6). Then, the query cluster result $\Pi(Q_C)$ is appended to $\Pi(Q)$ (line 7). Once cluster_search (Algorithm 2) has been performed for each query cluster in $\overline{Q}$, UBA terminates by returning the query result $\Pi(Q)$ (line 8).

Algorithm 1 UBA(Q, P)
Input: $Q$: collection of SP queries, $P$: collection of data points
Output: $\Pi(Q)$: collection of tuples of each SP query $q$ in $Q$ and the query result for $q$, i.e., $\Pi(Q) = \{\langle q, \Pi(q) \rangle \mid q \in Q\}$
1: $\Pi(Q) \leftarrow \emptyset$ // The result set $\Pi(Q)$ is initially set to an empty set.
2: // Nearby query points are grouped into query clusters, as explained in Section 4.1.
3: $\overline{Q} \leftarrow$ cluster_points($Q$) // A set $Q$ of query points is changed into a set $\overline{Q}$ of query clusters.
4: // The cluster_search function performs batch processing of the SP queries in $Q_C$, as detailed in Algorithm 2.
5: for each query cluster $Q_C \in \overline{Q}$ do
6:     $\Pi(Q_C) \leftarrow$ cluster_search($Q_C$, $P$)
7:     $\Pi(Q) \leftarrow \Pi(Q) \cup \Pi(Q_C)$ // The result for the query points in a query cluster $Q_C$, i.e., $\Pi(Q_C)$, is appended to $\Pi(Q)$.
8: return $\Pi(Q)$ // $\Pi(Q)$ is returned after the cluster search for all query clusters in $\overline{Q}$ is executed.
Algorithm 2 describes the cluster search algorithm employed to answer the SP queries in query cluster $Q_C$. Here, the cluster search performs batch execution for a query cluster to avoid unnecessary network traversals. This algorithm runs in two steps. In the first step, the SP queries are evaluated at the border points of $Q_C$ rather than at the query points in $Q_C$ (lines 3-6). Note that an SP query is either an NN or RN query; thus, the type of spatial query to be evaluated at a border point $b$ must be determined. If a query cluster $Q_C$ includes only NN queries, an NN query is evaluated at the border point $b$, i.e., $SPQ(b, Q_C) = \Pi(b^{NN})$. Similarly, if $Q_C$ includes only RN queries, the $SPQ(b, Q_C)$ function evaluates an RN query at border point $b$, i.e., $SPQ(b, Q_C) = \Pi(b^{RN})$. Finally, if $Q_C$ includes both NN and RN queries, the $SPQ(b, Q_C)$ function evaluates the SP query that finds all the data points satisfying the NN or RN conditions at border point $b$, i.e., $SPQ(b, Q_C) = \Pi(b^{NN}) \cup \Pi(b^{RN})$. In the second step, a shared computation is performed for each query segment $q_i q_j$ in $Q_C$ using the candidate data points obtained at the border points of $Q_C$ (lines 7-10). Here, each SP query in $q_i q_j$ chooses qualified data points from the candidate data points in $\Pi(b_i) \cup \Pi(b_j) \cup P(b_i b_j)$, where it is assumed that query segment $q_i q_j$ belongs to segment $b_i b_j$ in $Q_C$. Once segment_search (Algorithm 3) has been performed for each query segment in $Q_C$, the cluster_search algorithm (Algorithm 2) terminates by returning the query result $\Pi(Q_C)$ (line 11).
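The evaluation of $SPQ(b, Q_C)$ at a border point can be sketched in Python as follows. This is a simplified illustration and not INE or RNE: it runs a full Dijkstra search instead of an incremental expansion that stops early, it assumes that data points sit exactly on vertices, and it assumes a single query distance for all RN queries in the cluster. The names `shortest_distances` and `spq`, the adjacency-list encoding, and the toy graph are all hypothetical.

```python
import heapq


def shortest_distances(graph, source):
    """Dijkstra's algorithm on an adjacency list {u: [(v, weight), ...]}."""
    dist = {source: 0.0}
    heap = [(0.0, source)]
    while heap:
        d, u = heapq.heappop(heap)
        if d > dist.get(u, float("inf")):
            continue  # Stale heap entry.
        for v, weight in graph.get(u, []):
            candidate = d + weight
            if candidate < dist.get(v, float("inf")):
                dist[v] = candidate
                heapq.heappush(heap, (candidate, v))
    return dist


def spq(graph, border, data_points, has_nn, radius=None):
    """Return {data point: distance} for the SP query at a border point.

    data_points maps a data point id to the vertex where it is located.
    has_nn is True when the cluster contains at least one NN query.
    radius is the query distance when the cluster contains RN queries,
    otherwise None.
    """
    dist = shortest_distances(graph, border)
    reachable = {p: dist[v] for p, v in data_points.items() if v in dist}
    result = {}
    if has_nn and reachable:
        nearest = min(reachable, key=reachable.get)
        result[nearest] = reachable[nearest]  # NN part of the result
    if radius is not None:
        for p, d in reachable.items():
            if d <= radius:
                result[p] = d  # RN part of the result
    return result


# Toy undirected network with hypothetical travel times.
TOY_GRAPH = {
    "a": [("b", 2), ("d", 10)],
    "b": [("a", 2), ("c", 3)],
    "c": [("b", 3)],
    "d": [("a", 10)],
}
TOY_DATA = {"p1": "c", "p2": "d"}

if __name__ == "__main__":
    print(spq(TOY_GRAPH, "a", TOY_DATA, has_nn=True, radius=4))
```

The union of the NN and RN parts mirrors the mixed case $SPQ(b, Q_C) = \Pi(b^{NN}) \cup \Pi(b^{RN})$. The returned distances can be reused when each query point in the cluster later computes its own distance to the candidates.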

Algorithm 2 cluster_search(Q C , P)
Input: $Q_C$: query cluster, $P$: collection of data points
Output: $\Pi(Q_C)$: collection of tuples of each SP query $q$ in $Q_C$ and the query result for $q$, i.e., $\Pi(Q_C) = \{\langle q, \Pi(q) \rangle \mid q \in Q_C\}$
1: // Note that $B(Q_C)$ refers to the set of border points of $Q_C$.
2: $\Pi(Q_C) \leftarrow \emptyset$, $\Pi(B(Q_C)) \leftarrow \emptyset$ // Both $\Pi(Q_C)$ and $\Pi(B(Q_C))$ are initially set to an empty set.
3: // An SP query is evaluated at each border point $b$ of $Q_C$ to retrieve candidate data points for $Q_C$.
4: for each border point $b \in B(Q_C)$ do
5:     $\Pi(b) \leftarrow SPQ(b, Q_C)$ // An SP query is evaluated at a border point $b$, and its result is saved to $\Pi(b)$.
6:     $\Pi(B(Q_C)) \leftarrow \Pi(B(Q_C)) \cup \Pi(b)$ // The query result at a border point $b$ of $Q_C$ is appended to $\Pi(B(Q_C))$.
7: // $q_i q_j$ is assumed to belong to a segment $b_i b_j$ in $Q_C$.
8: for each query segment $q_i q_j \in Q_C$ do
9:     $\Pi(q_i q_j) \leftarrow$ segment_search($q_i q_j$, $\Pi(b_i) \cup \Pi(b_j) \cup P(b_i b_j)$)
10:     $\Pi(Q_C) \leftarrow \Pi(Q_C) \cup \Pi(q_i q_j)$ // The result for a query segment $q_i q_j$, i.e., $\Pi(q_i q_j)$, is appended to $\Pi(Q_C)$.
11: return $\Pi(Q_C)$ // cluster_search ends by returning the batch result $\Pi(Q_C)$ for the SP queries in $Q_C$.
Algorithm 3 segment_search($q_i q_j$, $\Pi(b_i) \cup \Pi(b_j) \cup P(b_i b_j)$)
Input: $q_i q_j$: query segment in $Q_C$, $\Pi(b_i) \cup \Pi(b_j) \cup P(b_i b_j)$: collection of candidate data points of the SP queries in $q_i q_j$
Output: $\Pi(q_i q_j)$: collection of tuples of each query $q$ in $q_i q_j$ and the query result for $q$, i.e., $\Pi(q_i q_j) = \{\langle q, \Pi(q) \rangle \mid q \in q_i q_j\}$
1: $\Pi(q_i q_j) \leftarrow \emptyset$ // $\Pi(q_i q_j)$ is initially set to an empty set.
2: for each SP query $q \in q_i q_j$ do
3:     $\Pi(q) \leftarrow \emptyset$ // $\Pi(q)$ is initially set to an empty set.
4:     for each candidate data point $p \in \Pi(b_i) \cup \Pi(b_j) \cup P(b_i b_j)$ do
5:         // Step 1: $dist(q, p)$ is evaluated considering the two cases $p \notin b_i b_j$ and $p \in b_i b_j$, which are shown in Figure 6.
6:         if $p$ is outside $b_i b_j$ then
7:             $dist(q, p) \leftarrow \min\{len(q b_i) + dist(b_i, p),\ len(q b_j) + dist(b_j, p)\}$
8:         else
9:             $dist(q, p) \leftarrow \min\{len(qp),\ len(q b_i) + dist(b_i, p),\ len(q b_j) + dist(b_j, p)\}$
10:         // Step 2: $\Pi(q)$ is updated according to the query type of $q$.
11:         if $q = q^{NN}$ and $dist(q, p) \le dist(q, p^{NN})$ then
12:             $\Pi(q) \leftarrow \Pi(q) \cup \{p\} - \{p^{NN}\}$ // $p$ replaces $p^{NN}$, which is the current NN of $q$ so far.
13:         else if $q = q^{RN}$ and $dist(q, p) \le q.r$ then
14:             $\Pi(q) \leftarrow \Pi(q) \cup \{p\}$ // If $dist(q, p) \le q.r$, $p$ is simply appended to $\Pi(q)$.
15:     $\Pi(q_i q_j) \leftarrow \Pi(q_i q_j) \cup \Pi(q)$
16: return $\Pi(q_i q_j)$ // segment_search ends by returning the batch result $\Pi(q_i q_j)$ for the SP queries in $q_i q_j$.

Algorithm 3 describes the segment search algorithm employed to answer the SP queries in a query segment $q_i q_j$ using the candidate data points in $\Pi(b_i) \cup \Pi(b_j) \cup P(b_i b_j)$. Here, the batch query result for $q_i q_j$, i.e., $\Pi(q_i q_j)$, is initially set to an empty set (line 1). The distance between a query point $q$ in $q_i q_j$ and a candidate data point $p$, i.e., $dist(q, p)$, is then calculated (lines 5-9), as shown in Figure 6. When $p$ is outside $b_i b_j$, i.e., $p \notin b_i b_j$, the distance from $q$ to $p$ is given as $dist(q, p) = \min\{len(q b_i) + dist(b_i, p),\ len(q b_j) + dist(b_j, p)\}$. When $p$ is inside $b_i b_j$, i.e., $p \in b_i b_j$, the distance from $q$ to $p$ is given as $dist(q, p) = \min\{len(qp),\ len(q b_i) + dist(b_i, p),\ len(q b_j) + dist(b_j, p)\}$. If query point $q$ is an NN query and candidate data point $p$ is closer to $q$ than the current NN $p^{NN}$, then $p$ is appended to $\Pi(q)$ and $p^{NN}$ is removed from $\Pi(q)$, i.e., $\Pi(q) \leftarrow \Pi(q) \cup \{p\} - \{p^{NN}\}$ (lines 11-12). Similarly, if query point $q$ is an RN query and $dist(q, p)$ is not greater than the query distance $q.r$, then $p$ is simply appended to $\Pi(q)$, i.e., $\Pi(q) \leftarrow \Pi(q) \cup \{p\}$, where $q.r$ is the query distance of $q$ (lines 13-14). The segment_search algorithm (Algorithm 3) ends by returning the batch result $\Pi(q_i q_j)$ for $q_i q_j$ (line 16). Lemma 1 establishes the correctness of UBA.

Lemma 1. Each query point $q$ in a query cluster $Q_C$ can retrieve its qualified data points from the candidate data points for $Q_C$.
Proof. We prove Lemma 1 by contradiction under the assumption that there exists a qualified data point $p$ for a query point $q$ in $Q_C$ such that $p$ is not a candidate data point for $Q_C$. The set $\Sigma(Q_C)$ of candidate data points for $Q_C$ is the union of the set $P(Q_C)$ of data points inside $Q_C$ and the SP query result $SPQ(b, Q_C)$ at each border point $b$ of $Q_C$ as follows: $\Sigma(Q_C) = P(Q_C) \cup \bigcup_{b \in B(Q_C)} SPQ(b, Q_C)$. Clearly, this data point $p$ must be outside $Q_C$ because, as illustrated in Figure 6b, a qualified data point $p$ inside $Q_C$ is a candidate data point for $Q_C$ according to the definition of $\Sigma(Q_C)$. When the qualified data point $p$ is outside $Q_C$, as illustrated in Figure 6a, the following two cases should be considered. In the first case, i.e., $\exists p((q^{RN} \in Q_C \wedge p \in \Pi(q^{RN})) \to p \notin \Sigma(Q_C))$, the qualified data point $p$ satisfies the range query $q^{RN}$; however, it is not a candidate data point for $Q_C$. In the second case, i.e., $\exists p((q^{NN} \in Q_C \wedge p \in \Pi(q^{NN})) \to p \notin \Sigma(Q_C))$, the qualified data point $p$ satisfies the NN query $q^{NN}$; however, it is not a candidate data point for $Q_C$.

Consider the first case. The shortest path from $q^{RN}$ to $p$ must pass through a border point of $Q_C$. For convenience, assume that the shortest path from $q^{RN}$ to $p$ is $q^{RN} \to b_l \to p$, where $b_l$ is a border point of $Q_C$. Note that the distance from $q^{RN}$ to $p$ is less than or equal to the query distance $r$, i.e., $dist(q^{RN}, p) \le r$. Thus, the distance from the border point $b_l$ to $p$ is also less than or equal to $r$, i.e., $dist(b_l, p) \le r$. Hence, $p$ belongs to the RN result $\Pi(b_l^{RN})$ evaluated at $b_l$, which is included in $SPQ(b_l, Q_C)$ and therefore in $\Sigma(Q_C)$. This contradicts the assumption that the qualified data point $p$ for $q^{RN}$ is not a candidate data point for $Q_C$.

Next, consider the second case, in which the qualified data point $p$ for $q^{NN}$ is not a candidate data point for $Q_C$. For convenience, assume that the shortest path from $q^{NN}$ to $p$ is $q^{NN} \to b_l \to p$ and that a data point $p_l$, rather than $p$, is the NN of $b_l$. This means that $p_l$ is closer to $b_l$ than $p$, i.e., $dist(b_l, p_l) < dist(b_l, p)$. Note that the shortest path from $q^{NN}$ to $p$ ($p_l$) is $q^{NN} \to b_l \to p$ ($q^{NN} \to b_l \to p_l$), so that $dist(q^{NN}, p_l) = dist(q^{NN}, b_l) + dist(b_l, p_l) < dist(q^{NN}, b_l) + dist(b_l, p) = dist(q^{NN}, p)$. Thus, $p_l$ should be the NN of $q^{NN}$ rather than $p$. This contradicts the assumption that $p$ is the NN of $q^{NN}$. Therefore, each query point $q$ in a query cluster $Q_C$ can retrieve its qualified data points from the candidate data points for $Q_C$.

Table 2 compares the time complexities of UBA and sequential algorithms, such as INE [3] and RNE [3], for dynamic spatial networks. Note that UBA is independent of the one-query-at-a-time processing algorithms [3,11-15] and can be easily incorporated into these algorithms. For simplicity, INE and RNE are considered to evaluate a single SP query in dynamic spatial networks, and their time complexity is $O(|E| + |V| \cdot \log|V|)$. UBA evaluates as many as $M \cdot |\overline{Q}|$ SP queries, where $|\overline{Q}|$ is the number of query clusters in $\overline{Q}$ and $M$ is the maximum number of border points in $Q_C$, i.e., $M = \max\{|B(Q_C)| \mid Q_C \in \overline{Q}\}$. Conversely, sequential algorithms evaluate as many as $|Q|$ SP queries because each query point must be handled individually. Thus, the time complexities of UBA and the sequential algorithms are $O(|\overline{Q}| \cdot (|E| + |V| \cdot \log|V|))$ and $O(|Q| \cdot (|E| + |V| \cdot \log|V|))$, respectively. The results of the time complexity analysis indicate that UBA is superior to sequential algorithms, particularly when $|\overline{Q}| \ll |Q|$, i.e., the query points exhibit a highly skewed distribution. In addition, the results indicate that UBA shows similar performance to sequential algorithms when $|\overline{Q}| \cong |Q|$, i.e., the query points exhibit a uniform distribution.

Evaluation of Example SP Queries Using UBA
This section describes the process used to evaluate the five example SP queries using UBA. As shown in Figure 5, the query cluster $Q_C$ has three border points, i.e., $v_1$, $v_2$, and $v_5$. Table 3 shows the results of the SP queries at these three border points.
The segment_search algorithm (Algorithm 3) is called for each query segment in $Q_C$. For convenience, the three query segments $q_1^{NN} q_2^{RN}$, $q_3^{NN}$, and $q_4^{RN} q_5^{NN}$ are processed sequentially. First, the segment_search function evaluates the SP queries in $q_1^{NN} q_2^{RN}$ using the candidate data points $p_2$ and $p_4$. This function computes the distance between each of the query points $q_1^{NN}$ and $q_2^{RN}$ and each of the candidate data points $p_2$ and $p_4$. Table 4 summarizes the distances between each query point $q$ in a query segment $q_i q_j$ and its candidate data points $p$. Here, the SP query $q_1^{NN}$ finds the data point closest to $q_1^{NN}$ among the candidate data points $p_2$ and $p_4$. Consequently, $p_4$ is the chosen NN of $q_1^{NN}$ because $p_4$ is closer to $q_1^{NN}$ than $p_2$ (Table 4). Similarly, the SP query $q_2^{RN}$ locates the data points within a query distance $r = 4$ of $q_2^{RN}$. Accordingly, $p_2$ is included in the result of $q_2^{RN}$, whereas $p_4$ is not, because $dist(q_2^{RN}, p_2) = 4$ and $dist(q_2^{RN}, p_4) = 11$ (Table 4). The query result for $q_1^{NN} q_2^{RN}$ is therefore $\Pi(q_1^{NN}) = \{p_4\}$ and $\Pi(q_2^{RN}) = \{p_2\}$.

Next, the segment_search function evaluates the SP query in $q_3^{NN}$ using the candidate data points $p_1$ and $p_2$. First, the distance between the query point $q_3^{NN}$ and each of the candidate data points $p_1$ and $p_2$ is computed. Then, the SP query $q_3^{NN}$ locates the data point that is closest to $q_3^{NN}$ among $p_1$ and $p_2$. Consequently, $p_1$ is the chosen NN of $q_3^{NN}$ because $p_1$ is closer to $q_3^{NN}$ than $p_2$ (Table 4). The query result for $q_3^{NN}$ is therefore $\Pi(q_3^{NN}) = \{p_1\}$.

Finally, the segment_search function evaluates the SP queries in $q_4^{RN} q_5^{NN}$ using the candidate data points $p_1$ and $p_4$. First, the distances between each query point in $q_4^{RN} q_5^{NN}$ and each of the candidate data points $p_1$ and $p_4$ are calculated. The SP query $q_4^{RN}$ locates the data points within a query distance $r = 4$ of $q_4^{RN}$. No data points belong to the result set of $q_4^{RN}$ because $dist(q_4^{RN}, p_1) = 9$ and $dist(q_4^{RN}, p_4) = 8$ (Table 4). The SP query $q_5^{NN}$ identifies the data point that is closest to $q_5^{NN}$ among $p_1$ and $p_4$.
Consequently, $p_1$ is the chosen NN of $q_5^{NN}$ because $p_1$ is closer to $q_5^{NN}$ than $p_4$ (Table 4). The query result for $q_4^{RN} q_5^{NN}$ is therefore $\Pi(q_4^{RN}) = \emptyset$ and $\Pi(q_5^{NN}) = \{p_1\}$. Clearly, the results of the SP queries in $Q$ are the union of the results for the query segments in $Q_C$: $\Pi(Q) = \{\langle q_1^{NN}, \{p_4\} \rangle, \langle q_2^{RN}, \{p_2\} \rangle, \langle q_3^{NN}, \{p_1\} \rangle, \langle q_4^{RN}, \emptyset \rangle, \langle q_5^{NN}, \{p_1\} \rangle\}$.
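The selection step of this walkthrough can be sketched in Python once the distances between query points and candidate data points are known. This is a simplified illustration and not the authors' code: the function name `select_results` and the table encoding are hypothetical. The RN distances for $q_2^{RN}$ and $q_4^{RN}$ are the values quoted above from Table 4, whereas the NN distances for $q_1^{NN}$ are hypothetical placeholders chosen only to respect the stated ordering ($p_4$ closer than $p_2$), because the text does not give them.

```python
def select_results(queries, distances):
    """Pick qualified data points for each SP query from its candidates.

    queries: list of (query id, "NN" or "RN", query distance or None).
    distances: {(query id, data point id): distance} for the candidates.
    """
    results = {}
    for name, kind, radius in queries:
        candidates = {p: d for (q, p), d in distances.items() if q == name}
        if kind == "NN":
            # Keep the single closest candidate, if any.
            results[name] = (
                {min(candidates, key=candidates.get)} if candidates else set()
            )
        else:
            # Keep every candidate within the query distance.
            results[name] = {p for p, d in candidates.items() if d <= radius}
    return results


QUERIES = [("q1", "NN", None), ("q2", "RN", 4), ("q4", "RN", 4)]
DISTANCES = {
    ("q1", "p2"): 7, ("q1", "p4"): 5,   # hypothetical values, ordering only
    ("q2", "p2"): 4, ("q2", "p4"): 11,  # quoted in the text (Table 4)
    ("q4", "p1"): 9, ("q4", "p4"): 8,   # quoted in the text (Table 4)
}

if __name__ == "__main__":
    print(select_results(QUERIES, DISTANCES))
```

With these inputs, $q_2^{RN}$ keeps only $p_2$ and $q_4^{RN}$ keeps nothing, which matches the walkthrough.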

Performance Study
In this section, the results from an empirical analysis of UBA are presented and compared with those of the conventional method [3]. The experimental settings are described in Section 5.1 and the experimental results are presented in Section 5.2.

Experimental Settings
Three real-world spatial networks [52] (Table 5) were used for the empirical study. These real-world spatial networks have different sizes and are part of the United States road network. For convenience, the extents of the spatial networks were normalized to a unit square $[0, 1]^2$, and the query distance $r$ was set to $10^{-2}$. The query points followed a centroid distribution, and the data points followed either a centroid or uniform distribution. Here, centroid-based points were generated to mimic the highly skewed distributions of POIs in the real world. First, the centroids $c_1, c_2, \ldots, c_{|C|}$ were selected randomly within the extent of the spatial networks, where $|C|$ is the number of centroids. The points around each centroid followed a normal distribution, with the mean at the centroid and the standard deviation set to $\sigma = 10^{-2}$. A total of 1-10 centroids were selected for the query points, and five centroids were selected for the data points. The number of NN queries was the same as that of the RN queries for the SP queries. The experimental parameters are listed in Table 6. In each experiment, a single parameter was varied within the range, and the other parameters were maintained at their default values (shown in bold).
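The centroid-based point generation described above can be sketched in Python as follows. This is a minimal illustration under assumptions that the text does not state: each point is assigned to a centroid uniformly at random, coordinates are clamped to the unit square, and the points are not snapped onto the road network. The function name `generate_centroid_points` and the `seed` parameter are hypothetical.

```python
import random


def generate_centroid_points(n_points, n_centroids, sigma=1e-2, seed=0):
    """Generate skewed 2D points around random centroids in [0, 1]^2."""
    rng = random.Random(seed)
    # Centroids are drawn uniformly at random from the unit square.
    centroids = [(rng.random(), rng.random()) for _ in range(n_centroids)]
    points = []
    for _ in range(n_points):
        cx, cy = rng.choice(centroids)  # assumed: uniform choice of centroid
        # Normal distribution with the centroid as mean and std. dev. sigma,
        # clamped so that the point stays inside the normalized extent.
        x = min(1.0, max(0.0, rng.gauss(cx, sigma)))
        y = min(1.0, max(0.0, rng.gauss(cy, sigma)))
        points.append((x, y))
    return centroids, points


if __name__ == "__main__":
    centroids, points = generate_centroid_points(n_points=10, n_centroids=3)
    print(centroids)
    print(points[:3])
```

A uniform distribution of data points would simply draw both coordinates with `rng.random()`. Mapping the generated coordinates to positions on road segments is outside the scope of this sketch.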

Table 6. Experimental parameters.
Parameter                                           Range
Number of query points ($|Q|$)                      1, 3, 5, 7, 10 ($\times 10^3$)
Number of data points ($|P|$)                       1, 3, 5, 7, 10 ($\times 10^3$)
Distribution of query points in $Q$                 (C)entroid
Distribution of data points in $P$                  (U)niform, (C)entroid
Number of centroids for query points in $Q$         1, 3, 5, 7, 10
Number of centroids for data points in $P$          5
Standard deviation for normal distribution ($\sigma$)   $10^{-2}$
Query distance ($r$)                                $10^{-2}$
Number of NN queries in $Q$                         $0.5 \times |Q|$
Roadmap                                             CAL, FLA, COL

Next, the proposed UBA was compared, in terms of query processing time and the number of evaluated SP queries, to a sequential algorithm called SEQ, which computes SP queries sequentially. Here, it was assumed that the query and data points moved freely within the dynamic spatial networks. Note that it is impractical to exploit the precomputation techniques presented in the literature [12,13,15] because the precomputed distances might be invalidated frequently when the query and data points move freely within a dynamic spatial network. UBA and SEQ use common subroutines for similar tasks, e.g., the evaluation of SP queries at a single query point. Both algorithms were implemented in C++ using the Microsoft Visual Studio 2019 development environment. The experiments were executed on a desktop computer running the Windows 10 operating system with 32 GB RAM and a 3.1 GHz processor (i9-9900). As in many recent studies [11,26,53], the indexing structures for UBA and SEQ remained in main memory to provide prompt responses, which are crucial in online map services. The experiments were repeated 10 times, and the average processing time required to evaluate the SP queries in $Q$ was measured. As stated previously, the proposed UBA is orthogonal to one-query-at-a-time processing algorithms [3,11-15] and can be easily incorporated into these algorithms.
In this study, INE [3] and RNE [3] were used to evaluate the NN and RN queries, respectively, because both are based on network expansion similar to Dijkstra's algorithm, which is well suited to dynamic spatial networks.

Figure 7 compares the query processing times of UBA and SEQ when evaluating the SP queries in the CAL roadmap. In Figures 7-9, the three upper-row charts and the three bottom-row charts show the experimental results when the data points followed a uniform distribution and a centroid distribution, respectively. Each chart shows the query processing time and the number of evaluated SP queries as one parameter is varied at a time (Table 6). The values in parentheses in Figures 7-10 indicate the number of SP queries evaluated by the proposed UBA. The numbers of SP queries evaluated by SEQ are omitted because they were exactly equal to |Q|. Figure 7a shows the query processing times of UBA and SEQ for 1 K ≤ |Q| ≤ 10 K. As can be seen, the proposed UBA clearly outperformed SEQ as the number of SP queries in Q increased. UBA was up to 2.9 times faster than SEQ (for |Q| = 7 K); however, UBA was 2.59 times slower than SEQ for |Q| = 1 K. Unlike SEQ, the proposed UBA was not sensitive to |Q|, which means that the effectiveness of batch processing in UBA increased with |Q|. When |Q| = 1 K, 3 K, 5 K, 7 K, and 10 K, UBA evaluated 75%, 89%, 88%, 91%, and 92% fewer SP queries than SEQ, respectively. Figure 7b shows the query processing times for 1 K ≤ |P| ≤ 10 K. UBA clearly outperformed SEQ in all cases, and its query processing time was up to 8.9 times shorter than that of SEQ (for |P| = 1 K). As |P| decreased, the search space for NN query processing increased.
Regardless of the change in |P|, UBA and SEQ evaluated 789 and 10,000 SP queries, respectively. Figure 7c shows the query processing times when the number of centroids |C| for the query points was varied between 1 and 10. The proposed UBA was up to 2.3 times faster than SEQ across all cases. As |C| increased, the difference in query processing times between UBA and SEQ decreased because increasing |C| reduced the density of the query points, which resulted in a larger number of query clusters to evaluate. Specifically, when |C| = 1, 3, 5, 7, …

Figure 7d-f show the query processing times of UBA and SEQ when the data points followed a centroid distribution. The query processing times of the proposed UBA were up to 18.95 times shorter than those of SEQ, and UBA was faster than SEQ in all cases. Unlike the case shown in Figure 7a, the query processing times of UBA and SEQ did not increase steadily with |Q|, as shown in Figure 7d, which means that the query processing time was more sensitive to the distribution of data points than to |Q| when the data points followed a highly skewed distribution. When |Q| = 1 K, 3 K, 5 K, 7 K, and 10 K, the query processing times of UBA were 21.7, 162.8, 21.9, 126.8, and 468.7 s, respectively. The difference in query processing times between UBA and SEQ for a centroid distribution of data points was up to several orders of magnitude greater than that for a uniform distribution of data points.

Figure 8 compares the query processing times of UBA and SEQ when evaluating the SP queries in the FLA roadmap. Figure 8a shows the query processing time as a function of |Q|. The proposed UBA was up to 2.2 times faster than SEQ for |Q| ≥ 3 K. However, SEQ was 2.7 times faster than UBA for |Q| = 1 K because the batch processing of UBA is designed for a large number of SP queries rather than a small number. Figure 8b shows the query processing time as a function of |P|.
UBA was 5.5 and 2.2 times faster than SEQ for |P| = 1 K and 10 K, respectively, even though UBA and SEQ evaluated 1601 and 10,000 SP queries, respectively, in both cases. This is because the search space for the NN queries when |P| = 1 K was greater than that when |P| = 10 K. Figure 8c shows the query processing time as a function of |C|; the query processing time of UBA was up to 2.1 times shorter than that of SEQ. Clearly, the number of query clusters increased with |C|, which adversely affected the performance of the proposed UBA. As shown in Figure 8d-f, UBA was up to 11 times faster than SEQ. The query processing times of both UBA and SEQ fluctuated, which means that the highly skewed distribution of data points affected the NN query processing time significantly. Specifically, as shown in Figure 8d, the query processing time of UBA for |Q| = 1 K was 8.9 times longer than that for |Q| = 3 K despite the smaller number of SP queries in Q.

Figure 9 compares the query processing times of UBA and SEQ with the COL roadmap. As shown in Figure 9a, the proposed UBA was up to 3.1 times faster than SEQ when 5 K ≤ |Q| ≤ 10 K; that is, UBA became superior to SEQ as |Q| increased. As shown in Figure 9b, UBA was up to 16.3 times faster than SEQ regardless of the |P| value because UBA and SEQ evaluated 409 and 10,000 SP queries, respectively. Clearly, this difference in the number of evaluated SP queries (i.e., 9591) occurred because the proposed UBA exploits the batch processing of the clustered SP queries and thus avoids unnecessary distance computations. As shown in Figure 9c, UBA clearly outperformed SEQ for all values of |C|. As |C| increased, the density of the query points decreased, which reduced the effectiveness of the batch processing of UBA. As shown in Figure 9d-f, UBA was up to 26.6 times faster than SEQ.
As shown in Figure 9d, the query processing times of UBA and SEQ fluctuated considerably because the highly skewed distribution of data points strongly affected the search space of the NN queries.
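The network expansion on which INE and RNE are based can be illustrated with a short sketch. The Python code below is a simplified illustration, not the INE and RNE algorithms of [3] themselves: it assumes that POIs are located at vertices (in a road network they generally lie on edges), that the graph is an adjacency dictionary mapping each vertex to a list of (neighbor, travel time) pairs, and the function names are hypothetical. Because the expansion reads the current edge weights at query time, no precomputed distances are needed, which is why this style of search suits networks whose travel times change frequently.

```python
import heapq


def expand(graph, source):
    """Yield (travel_time, vertex) pairs in nondecreasing order of travel time from source."""
    best = {source: 0.0}
    heap = [(0.0, source)]
    settled = set()
    while heap:
        d, u = heapq.heappop(heap)
        if u in settled:
            continue  # stale heap entry
        settled.add(u)
        yield d, u
        for v, w in graph.get(u, ()):
            nd = d + w
            if nd < best.get(v, float("inf")):
                best[v] = nd
                heapq.heappush(heap, (nd, v))


def nn_query(graph, poi_vertices, source, k=1):
    """Return the k POIs closest to source as (vertex, travel_time) pairs."""
    result = []
    for d, u in expand(graph, source):
        if u in poi_vertices:
            result.append((u, d))
            if len(result) == k:
                break  # expansion stops as soon as k POIs are settled
    return result


def range_query(graph, poi_vertices, source, r):
    """Return all POIs whose travel time from source is at most r."""
    result = []
    for d, u in expand(graph, source):
        if d > r:
            break  # every remaining vertex is farther than r
        if u in poi_vertices:
            result.append((u, d))
    return result
```

Both query types share the same expansion and differ only in the stopping condition, which suggests why nearby queries of either type can benefit from shared distance computations.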

Experimental Results
Two versions of UBA, i.e., UBA_SEG and UBA_CLS, were implemented and evaluated to investigate the effect of the two-step clustering method on the batch processing of UBA and its scalability in terms of |Q|. UBA_SEG transforms nearby query points into query segments, and UBA_CLS transforms nearby query points into query clusters. UBA_SEG and UBA_CLS are illustrated in Figure 5a,b, respectively. Figure 10 compares the query processing times of UBA_SEG and UBA_CLS with the CAL roadmap, where the two values in parentheses indicate the number of SP queries evaluated by UBA_SEG and UBA_CLS, respectively. As can be seen, UBA_SEG evaluated more SP queries than UBA_CLS. As shown in Figure 10a, when the data points exhibited a uniform distribution, UBA_SEG was up to 6.1 times faster than UBA_CLS for 1 K ≤ |Q| ≤ 10 K. However, as |Q| increased further, UBA_CLS became faster than UBA_SEG, which means that UBA_CLS scaled better than UBA_SEG with |Q|. Specifically, UBA_CLS was 1.5 times faster than UBA_SEG for |Q| = 100 K. As shown in Figure 10b, when the data points exhibited a centroid distribution, UBA_CLS was up to 2.2 times faster than UBA_SEG in all cases. Therefore, UBA_CLS scaled with |Q| better than UBA_SEG. It is clear that the distribution of data points affected the query processing time significantly. Specifically, when the data points exhibited uniform and centroid distributions, the query processing times of UBA_CLS were 1.5 and 497.7 s, respectively, for |Q| = 100 K.
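To make the difference between the two variants concrete, the following Python sketch groups query points in two steps. It is a hypothetical illustration only: the grouping criteria used here (query points on the same road segment form one query segment, and occupied segments that share an endpoint vertex are merged into one query cluster) are assumptions made for the example and may differ from the clustering rules defined for UBA; the function names and input layout are likewise invented.

```python
from collections import defaultdict


def group_into_segments(query_points):
    """Step 1 (assumed rule): group query ids by the road segment they lie on.

    query_points is an iterable of (query_id, segment_id) pairs.
    """
    segments = defaultdict(list)
    for query_id, segment_id in query_points:
        segments[segment_id].append(query_id)
    return dict(segments)


def merge_into_clusters(segments, endpoints):
    """Step 2 (assumed rule): merge occupied segments that share an endpoint vertex.

    endpoints maps segment_id -> (u, v). Returns a list of query-id lists.
    """
    parent = {seg: seg for seg in segments}

    def find(x):
        # Union-find lookup with path halving.
        while parent[x] != x:
            parent[x] = parent[parent[x]]
            x = parent[x]
        return x

    first_seen = {}  # vertex -> first occupied segment touching it
    for seg in segments:
        for vertex in endpoints[seg]:
            if vertex in first_seen:
                parent[find(seg)] = find(first_seen[vertex])
            else:
                first_seen[vertex] = seg

    clusters = defaultdict(list)
    for seg, query_ids in segments.items():
        clusters[find(seg)].extend(query_ids)
    return list(clusters.values())
```

In this sketch, the second step can only reduce the number of groups, which is consistent with the observation that UBA_SEG evaluated more SP queries than UBA_CLS.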

Conclusions
This paper has proposed the UBA to efficiently process SP queries, which comprise NN and RN queries, in dynamic spatial networks. The goal of the proposed UBA is to avoid unnecessary distance computations during batch processing. Accordingly, UBA performs two-step clustering of SP queries and processes them in batches to reduce the number of SP queries that must be evaluated. The experimental results confirmed that the proposed UBA outperformed a conventional algorithm based on one-query-at-a-time processing and scaled well with the number of queries. Specifically, the proposed UBA was up to 26.6 times faster than the conventional algorithm. The proposed UBA has several advantages. First, UBA avoids unnecessary network traversal by clustering SP queries and performing batch processing. Second, UBA can easily be incorporated into one-query-at-a-time processing algorithms for spatial networks [3,12,13,15]. However, the proposed UBA also has a disadvantage, i.e., its performance is very sensitive to the distribution of query points. UBA performs similarly to sequential algorithms when the query points exhibit a uniform distribution, whereas it clearly outperforms sequential algorithms when the query points exhibit a highly skewed distribution. In the future, we plan to apply this unified batch solution to extremely large spatial networks for distributed batch processing of sophisticated spatial queries, e.g., spatial join queries [54] and spatial keyword queries [2,50].