The MCR problem requires selecting the best location for a station so that it covers as many spatial objects as possible. Intuitively, the optimal location can be any point in the road network. It is prohibitively expensive to enumerate all such points, as there are infinitely many of them. To tackle this challenge, we present an effective exact method for the MCR problem. To this end, we first introduce how to select the point on a given edge that covers the maximum number of spatial objects.
2.2.1. Searching an Edge for the Optimal Location
To search an edge for the point that covers the maximum number of spatial objects, a naive idea is to evaluate the number of objects covered by each point on this edge. Take Figure 2a as an example, where the blue point p is a point on the edge. We explore the objects within the given radius by expanding from p. The coverage of p is highlighted in red. As can be seen from the figure, two objects are covered by p. By repeating this procedure for every point on the edge, we can find the optimal location on this edge. Unfortunately, as there is an infinite number of points, this idea is prohibitively expensive.
To address the above issue, we propose the most overlapped interval (MOI) problem. We then show that the problem of searching for the optimal point on an edge can be reduced to the MOI problem, which can be solved efficiently. We next give the formal definition of the MOI problem and illustrate the reduction with a detailed example.
Definition 4 (Most Overlapped Interval). Given an edge, a segment is a part of the edge bounded by two endpoints on the edge. Given a set of segments on the edge, the most overlapped interval problem aims to find an interval, bounded by two consecutive segment endpoints, that overlaps with the maximum number of segments.
Example 3. Consider the example in Figure 3, where there is one edge and four segments. The endpoints of the segments divide the edge into six intervals. Two of these intervals overlap with three segments each, which is more than any other interval. Thus, both of them are most overlapped intervals on the edge.
To explain how we can reduce the problem of searching for the optimal point on an edge to the MOI problem, we first present the following observation. Consider the edge in Figure 2a. We observe that a spatial object o is in the coverage of point p (highlighted in red). Inversely, if we draw the coverage of o, we notice that point p is located on the interval that is the overlap between o’s coverage and the edge. In fact, any point on this interval can cover o. Motivated by this observation, a natural idea is to identify the overlap between the edge and every spatial object’s coverage. Each overlap corresponds to a segment on the edge. We can then find the most overlapped interval. Any point on this interval is an optimal location on the edge, i.e., it covers the maximum number of spatial objects. The following example demonstrates the reduction using Figure 2a.
Example 4. There are three spatial objects in Figure 2a. For each spatial object, we draw its coverage; the three coverages are highlighted in red, gray, and green, respectively. Each object’s coverage overlaps with the edge, and the resulting segments divide the edge into three different intervals. One interval overlaps with the red and the green segments, while another overlaps with the green and the gray segments. Thus, these two intervals are the most overlapped intervals. We can pick a point p from either of them and return it to users as the result.
We have shown how to reduce the problem of searching an edge for the optimal point to the most overlapped interval problem. Two questions remain to be answered: (1) Which spatial objects should we consider when computing the overlap between the edge and their coverage, so that we can generate the segments for the MOI problem? (2) How do we solve the MOI problem efficiently? We next elaborate on the two questions in turn.
As demonstrated above, the input of the reduced MOI problem is a set of segments, each of which is the overlap between a spatial object’s coverage and the edge. Some spatial objects are so far from the edge that no point on the edge can reach them. It is unnecessary to consider such spatial objects when generating the segments for the MOI problem. Thus, to identify the valid spatial objects for the reduced MOI problem on an edge, we divide the spatial objects into three classes as follows:
Case 1: Irrelevant objects. A spatial object o is an irrelevant object to an edge (u, v) if no point on (u, v) can reach o within the radius γ. As a result, we do not need to consider such objects when searching for the most overlapped interval. We can easily derive the following lemma to verify whether a spatial object is irrelevant.
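Lemma 1. Let $d(\cdot, \cdot)$ denote the network distance. If $d(u, o) > \gamma$ and $d(v, o) > \gamma$, then o is an irrelevant object to (u, v).
Proof. For any point p on (u, v), its distance to o is $d(p, o) = \min\{d(p, u) + d(u, o),\ d(p, v) + d(v, o)\}$. Since $d(u, o) > \gamma$ and $d(v, o) > \gamma$, it is obvious that $d(p, o) > \gamma$. Thus, o is an irrelevant object. □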
Case 2: Fully-covered objects. A spatial object o is a fully-covered object to an edge (u, v) if every point on (u, v) can reach o within the radius γ. Such an object overlaps with every interval on the edge, so it does not affect which interval is the most overlapped one, and it is unnecessary to consider o when searching the edge for the most overlapped interval. We derive the following lemma to verify whether o is a fully-covered object.
Lemma 2. An object o is a fully-covered object to the edge (u, v) if at least one of the following conditions holds:
- 1.
;
- 2.
;
- 3.
.
Proof. It is obvious that
o is a fully-covered object when the first two conditions hold. For the third condition, consider a point
p on the edge
. We
Since
, we
, i.e., any point
can reach
o within the radius
. □
Case 3: Partially covered objects. A spatial object o is a partially covered object to an edge (u, v) if some points on (u, v) can reach o within the radius γ, while the other points cannot. Therefore, to search for the optimal point on the edge (u, v), we need to find all partially covered objects and determine the most overlapped interval. To verify whether an object is a partially covered object to (u, v), we present the following lemma.
Lemma 3. An object (o) is a partially covered object if the following two conditions hold:
- 1.
- 2.
, or
Proof. According to the definition of partially covered objects, we consider the following three cases: (1) as shown in Figure 4a, only the points on an interval starting at u can reach o within γ, through node u; (2) only the points on an interval ending at v can reach o within γ, through node v; and (3) points on both such intervals can reach o within γ, through u and v, respectively. By considering the three cases, we can easily derive the inequalities in the lemma. □
The above three cases indicate that we only need to consider the segments formed by the intersection between each partially covered object’s coverage and the edge.
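The three cases can be illustrated with a short Python sketch that works directly from the case definitions rather than from the exact inequalities of Lemmas 1–3. It assumes (this is an assumption of the sketch, not a statement from the text) that an object is reached from a point on the edge only through one of the endpoints u or v, so that a point at offset x from u is at distance min(x + d_u, length − x + d_v) from the object. The names `classify_object`, `coverage_segments`, `d_u`, `d_v`, and `length` are hypothetical.

```python
from typing import List, Tuple

Segment = Tuple[float, float]  # (start, end) offsets measured from u along the edge


def classify_object(d_u: float, d_v: float, length: float, gamma: float) -> str:
    """Classify an object w.r.t. edge (u, v) using the case definitions.

    d_u / d_v are the (assumed precomputed) network distances from u / v
    to the object; length is the edge length; gamma is the radius.
    """
    # The closest point of the edge to the object is one of the endpoints.
    if min(d_u, d_v) > gamma:
        return "irrelevant"
    # The farthest point is where the routes via u and via v tie,
    # clamped to the edge in case the tie falls outside of it.
    x_tie = min(max((length + d_v - d_u) / 2.0, 0.0), length)
    worst = min(x_tie + d_u, length - x_tie + d_v)
    return "fully-covered" if worst <= gamma else "partially-covered"


def coverage_segments(d_u: float, d_v: float, length: float, gamma: float) -> List[Segment]:
    """Return the part(s) of the edge from which the object is within gamma."""
    segments: List[Segment] = []
    reach_u = gamma - d_u  # how far the coverage extends onto the edge from u
    reach_v = gamma - d_v  # how far the coverage extends onto the edge from v
    if reach_u >= 0:
        segments.append((0.0, min(reach_u, length)))
    if reach_v >= 0:
        segments.append((max(length - reach_v, 0.0), length))
    # If the two parts meet, the whole edge is covered.
    if len(segments) == 2 and segments[0][1] >= segments[1][0]:
        return [(0.0, length)]
    return segments
```

For a partially covered object, `coverage_segments` yields one or two segments, which are the inputs of the MOI problem on that edge.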
We showed how to identify the set of spatial objects whose coverage on the edge should be considered in the MOI problem. With the MOI problem constructed, how can one efficiently find the most overlapped interval? To answer this question, one straightforward idea is to maintain a counter for each interval and iterate over all segments to update the counter of every overlapped interval. We refer to this algorithm as the MOIEnumerate algorithm.
MOIEnumerate Algorithm. Algorithm 1 shows the pseudocode of this straightforward idea. The algorithm takes as input a set of segments on the edge and outputs the most overlapped interval. The algorithm first collects the endpoints from all segments (lines 1–3) and sorts the endpoints (line 4). It computes all intervals bounded by every two consecutive endpoints (line 5). Next, the algorithm iterates over all segments (lines 6–8). For each segment, it computes the overlapped intervals (line 7) and increases the associated counters by 1 (line 8). When all segments are processed, it chooses the interval with the largest counter (line 9) and returns it to the user as the final result.
Algorithm 1: MOIEnumerate
Complexity Analysis. It takes $O(n \log n)$ time to sort the endpoints of all segments, where $n$ is the number of segments. The endpoints divide the edge into $O(n)$ intervals. Then, for each segment, we need to locate the intervals that are covered by the segment and update the counter of each such interval. Thus, it takes $O(n)$ time to process one segment, and there are $n$ segments to be processed. Therefore, the total complexity of Algorithm 1 is $O(n^2)$.
The MOIEnumerate algorithm is easy to implement. However, an expensive operation in the algorithm is the counter-updating operation, i.e., for each segment, we need to identify the overlapped intervals and update their counters in turn. The number of overlapped intervals is $O(n)$ per segment; thus, the MOIEnumerate algorithm scales poorly with respect to the number of segments.
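The enumeration idea can be sketched in Python as follows. This is a minimal illustration written from the prose description, not a transcription of Algorithm 1; the function name `moi_enumerate` and the representation of a segment as a `(left, right)` tuple of offsets along the edge are assumptions.

```python
from typing import List, Tuple

Segment = Tuple[float, float]  # (left, right) offsets along the edge


def moi_enumerate(segments: List[Segment]) -> Tuple[Segment, int]:
    """Return one most overlapped interval and its overlap count."""
    # Collect and sort the distinct endpoints of all segments.
    points = sorted({p for seg in segments for p in seg})
    # Every two consecutive endpoints bound one interval.
    intervals = list(zip(points, points[1:]))
    if not intervals:
        raise ValueError("need at least one segment of positive length")
    counters = [0] * len(intervals)
    # For each segment, bump the counter of every interval it contains.
    for left, right in segments:
        for i, (a, b) in enumerate(intervals):
            if left <= a and b <= right:
                counters[i] += 1
    # Pick the interval with the largest counter (first one on ties).
    best = max(range(len(intervals)), key=counters.__getitem__)
    return intervals[best], counters[best]
```

The nested loop over segments and intervals is what makes this approach quadratic in the number of segments.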
MOILinearScan Algorithm. To tackle this efficiency issue of the MOIEnumerate algorithm, we next analyze the MOI problem and design an efficient linear scan-based algorithm, named MOILinearScan. We use the following example to explain the intuition behind the algorithm.
Example 5. Consider the example in Figure 5. The endpoints of the four segments divide the edge into several disjoint intervals. We mark the left endpoints in red and the right endpoints in blue. Assume we move a sweep point from u to v (from left to right). The sweep point first meets the left endpoint of one segment. From now on, the sweep point overlaps with this segment. We keep moving the sweep point until it meets the left endpoint of a second segment. The sweep point now overlaps with two segments. Similarly, when the sweep point meets the left endpoint of a third segment, it overlaps with three segments. The next endpoint the sweep point meets is the right endpoint of one of these segments. After that, the sweep point overlaps with only the other two segments. We keep moving the sweep point until it reaches the end of this edge, i.e., v.
Assume a sweep point is moving from left to right. From the above example, we can observe the following properties:
Property 1. When the sweep point meets a left endpoint of any segment, the next interval is overlapped with this segment.
Property 2. When the sweep point meets the right endpoint of any segment, the next interval is no longer overlapped with this segment.
The two properties describe how the status changes between consecutive intervals. Recall that each interval is bounded by two endpoints. If no two endpoints coincide, the numbers of overlapped segments of two consecutive intervals differ by exactly 1. For instance, in Example 5, the interval after the first left endpoint overlaps with one segment, while the next interval overlaps with two segments. Similarly, the interval after the third left endpoint overlaps with three segments, while the interval that follows overlaps with two segments.
From the two properties, we derive the following lemma, which is the key to the MOILinearScan algorithm.
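Lemma 4. Among all intervals on the edge, an interval that starts at a left endpoint and ends at a right endpoint overlaps with more segments than either of its adjacent intervals.
Proof. By Property 1, the segment that starts at the left endpoint overlaps with this interval but not with the preceding interval. By Property 2, the segment that ends at the right endpoint overlaps with this interval but not with the following interval. Thus, each adjacent interval overlaps with fewer segments, and the lemma follows. □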
This inspires us to design a sweep point-based method to search for the most overlapped interval. Specifically, we use a sweep point to scan the edge from left to right. During the sweep, we maintain a counter to record the number of overlapped segments, keeping in mind the type of the most recently passed endpoint, i.e., whether it is a left endpoint or a right endpoint of a segment. If we encounter an interval bounded by a left endpoint followed by a right endpoint, it is a maximal interval, which means it overlaps with more segments than its adjacent intervals. We refer to this algorithm as MOILinearScan.
Algorithm 2 shows the pseudocode of MOILinearScan. The algorithm takes as input a set of segments on the edge and outputs the most overlapped interval. Initially, the algorithm collects the left and right endpoints of all segments and stores them in lists L and R, respectively (lines 2–3). Both lists are sorted in ascending order (line 4). A variable representing the last endpoint the sweep point has encountered and a counter are then initialized (lines 5–6). Next, the algorithm iteratively pops endpoints from L and R, which represents moving the sweep point from left to right (lines 7–15). In each iteration, if the next endpoint comes from L, the sweep point meets a left endpoint, and the counter of the following interval is increased by 1 (lines 8–10). Otherwise, the sweep point meets a right endpoint, and the counter of the current interval is larger than those of its adjacent intervals. If the current interval’s counter is larger than the recorded maximum, we record the interval and its counter. When all left endpoints are processed, we terminate the iteration (lines 16–17) and return the recorded interval to users.
Example 6. Consider the example in Figure 5. The two lists L and R are sorted in ascending order. The variable recording the last endpoint is initialized, and the counter is initialized as 1. In the first iteration, an endpoint is popped from L, as it precedes the first endpoint in R. We update the last endpoint and the counter accordingly. In the second iteration, another endpoint is popped from L, and both variables are updated again. In the third iteration, an endpoint is popped from R. In this case, the interval between the last endpoint and the popped endpoint is a maximal interval, and we record this interval and its counter. By repeating this process, we can know the counter for every interval. Since the recorded interval has the maximum counter, we return it to the user as the final result.
Complexity Analysis. It takes $O(n \log n)$ time to sort the left/right endpoints of the $n$ segments. We then process one endpoint from either L or R in every iteration, and there are $O(n)$ iterations. Putting these together, the total complexity of Algorithm 2 is $O(n \log n)$.
Algorithm 2: MOILinearScan.
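The sweep described above can be sketched in Python as follows. This is an illustration of the idea under stated assumptions, not a line-by-line transcription of Algorithm 2: the function name `moi_linear_scan`, the `(left, right)` tuple representation, and the initialization (counter starting at 0 before any endpoint is consumed) are choices made for this sketch.

```python
from typing import List, Optional, Tuple

Segment = Tuple[float, float]  # (left, right) offsets along the edge


def moi_linear_scan(segments: List[Segment]) -> Tuple[Optional[Segment], int]:
    """Return one most overlapped interval and its overlap count."""
    lefts = sorted(left for left, _ in segments)
    rights = sorted(right for _, right in segments)
    i = j = 0                # next unread positions in lefts / rights
    count = 0                # segments overlapping the current interval
    last = None              # last endpoint the sweep point has passed
    last_is_left = False     # type of that endpoint
    best: Optional[Segment] = None
    best_count = 0
    while i < len(lefts):
        if lefts[i] < rights[j]:
            # Left endpoint: the following interval gains one segment.
            count += 1
            last, last_is_left = lefts[i], True
            i += 1
        else:
            # Right endpoint: a left-to-right interval is a local maximum.
            if last_is_left and count > best_count:
                best, best_count = (last, rights[j]), count
            count -= 1
            last, last_is_left = rights[j], False
            j += 1
    # The last left endpoint and the next right endpoint bound the final
    # local maximum; afterwards the counter can only decrease.
    if count > best_count:
        best, best_count = (last, rights[j]), count
    return best, best_count
```

After the two sorts, each endpoint is visited at most once, so the scan itself is linear in the number of segments.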
2.2.2. Searching the Whole Road Network for the Optimal Location
In the previous subsection, we demonstrated how to search an edge for the optimal location, such that a station placed at that location covers the maximum number of spatial objects. Thus, to search the whole road network for the optimal point, a natural idea is to repeat the previous procedure on every edge of the road network. Specifically, for every edge, we identify the partially covered objects, construct the corresponding MOI problem, and invoke Algorithm 2 to solve it. Unfortunately, this idea involves many redundant operations, and it is unnecessary to examine every edge. In this subsection, we present a method to estimate, for every edge, an upper bound on the number of objects that a station on the edge can cover. With the upper bound, we can examine the edges in a greedy manner, which greatly reduces the search space.
The upper bound of an edge tells us the maximum number of objects that can be covered if we place a station on the edge. A straightforward way to define the upper bound is based on the partially covered objects and fully-covered objects that we introduced in Section 2.2.1. Recall that fully-covered objects are the spatial objects that any point on the edge can reach within the radius, while partially covered objects are the spatial objects that only some of the points on the edge can reach within the radius. Therefore, the total number of fully-covered and partially covered objects is an upper bound on the number of covered objects if we place a station on this edge. However, the number of spatial objects in the road network is usually very large, and new objects may emerge rapidly. It would be expensive to check every spatial object’s status. In contrast, the road network is relatively small and seldom changes. As a result, we propose computing the upper bounds at the edge level.
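To this end, a key question to be answered is: what is the upper bound of the distance between two points $p_1$ and $p_2$, where $p_1$ and $p_2$ are from two different edges? To answer this question, we use the example in Figure 6. In the example, $e_1 = (u_1, v_1)$ and $e_2 = (u_2, v_2)$ are two edges, and $p_1$ and $p_2$ are two points on the two edges, respectively. We can easily derive the distance between $p_1$ and $p_2$ as
$$d(p_1, p_2) = \min_{a \in \{u_1, v_1\},\ b \in \{u_2, v_2\}} \{ d(p_1, a) + d(a, b) + d(b, p_2) \}.$$
It is obvious that $d(p_1, u_1) \le |e_1|$, $d(p_1, v_1) \le |e_1|$, $d(p_2, u_2) \le |e_2|$, and $d(p_2, v_2) \le |e_2|$, where $|e_1|$/$|e_2|$ is the length of edge $e_1$/$e_2$. We define the function $f(e_1, e_2)$ as follows:
$$f(e_1, e_2) = |e_1| + |e_2| + \min\{ d(u_1, u_2),\ d(u_1, v_2),\ d(v_1, u_2),\ d(v_1, v_2) \}.$$
We can easily derive that
$$d(p_1, p_2) \le f(e_1, e_2).$$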
The computation of this function relies only on the lengths of the two edges and the four node-pair distances between their endpoints, which means it can be efficiently computed. Moreover, as it gives an upper bound for the maximum point distance between two edges, it can be used to estimate the coverage number upper bound of an edge. Specifically, given an edge e, we identify the set of candidate edges such that, for each candidate edge, the distance upper bound between e and that edge does not exceed γ. If an edge is not a candidate edge, then no matter where on e the station is placed, it cannot cover the spatial objects on that edge, as their distance would be larger than γ. We can simply add up the number of spatial objects on every candidate edge as the coverage number upper bound.
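The edge-level estimate can be sketched in Python as follows. The sketch assumes that the distance bound has the form "sum of the two edge lengths plus the smallest of the four node-pair distances", which is one reading of the description above; the names `distance_upper_bound`, `coverage_upper_bound`, `length`, `node_dist`, and `object_count` are hypothetical, and `node_dist` stands for any precomputed shortest-path lookup.

```python
from typing import Callable, Dict, Hashable, Iterable, Tuple

Node = Hashable
Edge = Tuple[Node, Node]


def distance_upper_bound(e1: Edge, e2: Edge,
                         length: Dict[Edge, float],
                         node_dist: Callable[[Node, Node], float]) -> float:
    """Assumed bound: |e1| + |e2| + smallest endpoint-to-endpoint distance."""
    (u1, v1), (u2, v2) = e1, e2
    shortest = min(node_dist(a, b) for a in (u1, v1) for b in (u2, v2))
    return length[e1] + length[e2] + shortest


def coverage_upper_bound(e: Edge, edges: Iterable[Edge],
                         length: Dict[Edge, float],
                         node_dist: Callable[[Node, Node], float],
                         object_count: Dict[Edge, int],
                         gamma: float) -> int:
    """Add up the object counts of all candidate edges of e."""
    return sum(object_count[c] for c in edges
               if distance_upper_bound(e, c, length, node_dist) <= gamma)
```

Only edge lengths, node-pair distances, and per-edge object counts are touched, so no individual spatial object has to be inspected at this stage.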
With the upper bound defined, we can search the graph in a greedy manner. The edges with higher upper bounds are searched first. The search is terminated if the upper bounds of the remaining edges are less than the coverage number of the current optimal point. Algorithm 3 shows the pseudocode for searching the whole graph. It takes as input the road network and a radius γ, and outputs the station location that covers the maximum number of spatial objects. It uses a priority queue h to keep the edges and their coverage number upper bounds, and h is initialized as empty (line 1). Initially, for each edge e, it computes the coverage number upper bound and pushes it into the priority queue (lines 2–5). Specifically, it first identifies the set of candidate edges such that the distance upper bound does not exceed γ for every candidate edge (line 3). Then it computes the coverage number upper bound by adding up the number of objects on every candidate edge (line 4). With all upper bounds computed and pushed into h, it starts searching greedily (lines 7–14). It records the currently known optimal location and its coverage number. In each iteration, it pops the edge e with the maximum upper bound from h (line 8). If the upper bound is less than the current best coverage number, it is unnecessary to examine the remaining edges and the search is terminated (lines 9–10). Otherwise, it constructs the corresponding MOI problem on the edge e by generating the segments on e from the objects on its candidate edges (line 11). Then it invokes Algorithm 2 to solve the MOI problem and find the optimal point p and its coverage number on the edge e (line 12). The current optimal point is updated accordingly if the coverage number of p is larger than the current best (lines 13–14). When the search is finished, it returns the optimal location and its coverage number to the users as the final result.
Remark 2. In Algorithm 3, in order to compute the coverage number upper bound for each edge, we need to identify its set of candidate edges (line 3). For each pair of edges, we compute the distance upper bound, which involves many node pair distance computations. As an edge can only be added into the candidate set if its distance upper bound does not exceed γ, we do not need to compute all pair-wise distances in the road network. Instead, we only need to perform a length-constrained Dijkstra algorithm for each node, i.e., compute the set of nodes whose distance is less than γ. This is much more efficient.
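A length-constrained Dijkstra search of the kind mentioned in the remark can be sketched as follows. The adjacency-list format and the name `bounded_dijkstra` are assumptions of this sketch; it keeps nodes at distance at most γ, whereas the remark speaks of distances less than γ, so the comparison may need adjusting to match the intended boundary behavior.

```python
import heapq
from typing import Dict, Hashable, List, Tuple

Node = Hashable
Adjacency = Dict[Node, List[Tuple[Node, float]]]  # node -> [(neighbor, weight)]


def bounded_dijkstra(adj: Adjacency, source: Node, gamma: float) -> Dict[Node, float]:
    """Shortest distances from source to every node within distance gamma."""
    dist: Dict[Node, float] = {source: 0.0}
    heap = [(0.0, source)]
    while heap:
        d, node = heapq.heappop(heap)
        if d > dist.get(node, float("inf")):
            continue  # stale queue entry
        for nbr, weight in adj.get(node, []):
            nd = d + weight
            # Never expand beyond the radius: this is the length constraint.
            if nd <= gamma and nd < dist.get(nbr, float("inf")):
                dist[nbr] = nd
                heapq.heappush(heap, (nd, nbr))
    return dist
```

Because the search stops expanding at γ, its cost depends on the size of the neighborhood within the radius rather than on the size of the whole road network.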
Algorithm 3: MCRExact.
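The greedy search loop can be sketched in Python as follows. The sketch takes the per-edge upper bounds as given and delegates the per-edge MOI computation to a callback; the names `greedy_search`, `upper_bound`, and `solve_edge` are hypothetical, and the extra `examined` counter is added only to make the pruning visible.

```python
import heapq
from typing import Callable, Dict, Hashable, Optional, Tuple

Edge = Hashable


def greedy_search(upper_bound: Dict[Edge, int],
                  solve_edge: Callable[[Edge], Tuple[object, int]]
                  ) -> Tuple[Optional[object], int, int]:
    """Examine edges in descending order of upper bound, with early stop.

    solve_edge(e) is assumed to return (best point on e, its coverage).
    Returns (best location, best coverage, number of edges examined).
    """
    # heapq is a min-heap, so bounds are negated; the index breaks ties
    # without ever comparing edge objects.
    heap = [(-ub, i, e) for i, (e, ub) in enumerate(upper_bound.items())]
    heapq.heapify(heap)
    best_loc, best_cov, examined = None, 0, 0
    while heap:
        neg_ub, _, edge = heapq.heappop(heap)
        if -neg_ub < best_cov:
            break  # no remaining edge can beat the current best
        loc, cov = solve_edge(edge)
        examined += 1
        if cov > best_cov:
            best_loc, best_cov = loc, cov
    return best_loc, best_cov, examined
```

For instance, with upper bounds 5, 3, and 2 and a first edge whose exact coverage is 4, the loop stops after examining a single edge.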
Complexity. To compute the coverage upper bound, we need to perform a length-constrained Dijkstra algorithm for every node. Let be the average complexity for such a length-constrained Dijkstra algorithm. It takes to initialize the priority queue h. During the search, we need to compute the segments on the edge and invoke Algorithm 2 to solve the constructed MOI problem. Let be the number of generated segments. It takes to finish one iteration. Let k be the number of iterations that we need to search. The total time complexity is .