Exploiting Obstacle Geometry to Reduce Search Time in Grid-Based Pathfinding

Abstract: Pathfinding is the problem of finding the shortest path between a pair of nodes in a graph. In the context of uniform-cost undirected grid maps, heuristic search algorithms, such as A ⋆ and weighted A ⋆ (WA ⋆ ), are the dominant approach to pathfinding. However, because these algorithms have no knowledge of obstacle shapes in a grid map, they often explore areas where no viable path is available. We refer to such areas in a grid map as blocked areas (BAs). This paper introduces a preprocessing algorithm that analyzes the geometry of obstacles in a grid map and stores knowledge about blocked areas in a memory-efficient balanced binary search tree data structure. During actual pathfinding, a search algorithm accesses the binary search tree to identify blocked areas in a grid map and therefore avoids exploring them. As a result, the search time is significantly reduced. The scope of the paper covers maps in which obstacles are represented as horizontal and vertical line-segments. The impact of using the blocked area knowledge during pathfinding in A ⋆ and WA ⋆ is evaluated using a publicly available benchmark set consisting of sixty grid maps of mazes and rooms. In mazes, the search time for both A ⋆ and WA ⋆ is reduced by 28%, on average. In rooms, the search time for both A ⋆ and WA ⋆ is reduced by 30%, on average. This is achieved while preserving the search optimality of A ⋆ and the search sub-optimality of WA ⋆ .


Introduction
Grid-based pathfinding has been the subject of considerable interest in a number of fields such as video games and robotics navigation. A ⋆ [1] is a simple, best-first search algorithm that relies on a heuristic function to guide the search towards finding the optimal path between a source node and a goal node in a grid map. To reduce the execution time of the A ⋆ algorithm, researchers typically focus on finding new heuristic functions that reduce the number of visited nodes (i.e., search space) during pathfinding.
A ⋆ is optimal if the heuristic function it uses is admissible, i.e., it never overestimates the distance to the goal [2]. Weighted A ⋆ (WA ⋆ ) [3] relaxes the admissibility rule and multiplies the heuristic function by a factor ε > 1. While doing so might lead to finding sub-optimal paths, inflating the heuristic forces the search algorithm to prioritize exploring more promising paths rather than exploring every possible path to guarantee optimality. For example, Figure 1 compares the number of visited nodes in both A ⋆ and WA ⋆ (with ε = 3) when trying to find the same path in an 8-neighbor grid map of a 199 × 364 maze. Both algorithms use the octile-distance heuristic, which is a commonly-used admissible heuristic function that allows both straight and diagonal movements. As shown in Figure 1c, the greedy nature of WA ⋆ led it to prioritize exploring the nodes closer to the goal, which resulted in finding a different, sub-optimal path. By contrast, A ⋆ slowly explores all similar nodes, i.e., nodes with the same cost, to guarantee optimality (as shown in Figure 1b).
WA ⋆ is a simple yet effective extension of A ⋆ and is still in wide use to this day [4,5]. However, WA ⋆ , like A ⋆ , has no knowledge about the geometry of obstacles in a grid map, which could mislead the search into exploring paths with blocked ends. For example, Figure 2a shows twenty-two polygon-shaped areas with blocked ends in the maze of Figure 1a. We refer to such shapes as blocked areas (BAs) because they are bound by obstacles from all directions, except for the one direction where the search may enter this area. This paper aims to make knowledge about blocked areas in a grid map available during pathfinding. By doing so, a search algorithm can avoid exploring useless paths inside blocked areas, which in turn reduces search time. For example, Figure 1 shows that both A ⋆ and WA ⋆ have wastefully explored most of the blocked areas identified in Figure 2a. By comparison, Figure 2b,c show the potential of using the blocked areas' knowledge to guide both search algorithms away from blocked areas, thus significantly reducing the number of visited nodes during pathfinding.
Figure 1. Visited nodes (shown in dark grey) during optimal pathfinding using A ⋆ and sub-optimal pathfinding using weighted A ⋆ (WA ⋆ ) (with weight ε = 3.0) for the path shown in red. Note that the source node is at the bottom right and the goal node is at the bottom left of the map.
Figure 2. The impact of using the blocked areas' knowledge on reducing the number of visited nodes in optimal and sub-optimal pathfinding for the same path in Figure 1. Panels: (b) A ⋆ + BA; (c) WA ⋆ + BA.
To make use of the blocked areas' knowledge, we propose the following approach. First, a preprocessing algorithm investigates the geometry of obstacles in a grid map and identifies blocked areas. Next, information about blocked areas is stored in a memory-efficient balanced binary search tree, referred to as the BA-tree. During actual pathfinding, a search algorithm accesses the BA-tree to determine if a particular node is inside a blocked area. If so, then this node is not explored, i.e., eliminated from the search space.
An important property of our proposed method is that it does not depend on any specific heuristic function. Instead, it utilizes the geometry of obstacles to eliminate parts of the map that are irrelevant to the search, i.e., in a sense, it reduces a map to a new (more concise) map that a heuristic search algorithm can investigate faster using its original heuristic function. Therefore, our method is orthogonal to the search algorithm itself, and hence can be combined with many heuristic search algorithms in the literature. As a proof of concept, in this paper, we present and evaluate our proposed method using the A ⋆ and WA ⋆ search algorithms combined with the octile-distance heuristic. Another important property of our approach is that it preserves the optimality of a search algorithm. We present mathematical proofs of this claim in Section 4.
In addition, our approach has a small memory overhead because nodes in a grid map need not store any information about blocked areas, which would scale poorly for large maps. Instead, during pathfinding, a search algorithm retrieves this information from the BA-tree, whose memory requirements are bounded by a small fraction of the nodes in a grid map, as will be explained in Section 5. While the concept of blocked areas can be generalized to any obstacle shape, in this paper, we consider maps where obstacles are represented as vertical and horizontal line-segments, which in turn form blocked areas that are polygons. Such cases are commonly found in maps of mazes, buildings and transportation networks.
We evaluate the impact of using the blocked areas' knowledge for A ⋆ and WA ⋆ using a publicly available benchmark set that includes sixty maps of mazes and rooms [6]. Our evaluation shows that the search space (i.e., the number of visited nodes during pathfinding) of A ⋆ is reduced by 34%, on average, for both mazes and rooms. This results in a significant reduction in search time for both A ⋆ and WA ⋆ . Specifically, in mazes, the execution times of both A ⋆ and WA ⋆ were reduced by 10-35% (the average is 28%). In rooms, the execution times of both A ⋆ and WA ⋆ were reduced by 5-57% (the average is 30%). Our evaluation also demonstrated that the memory overhead associated with storing blocked areas' information in memory during preprocessing and the time associated with accessing this information during pathfinding are both small.
The remainder of this paper is organized as follows: Section 2 surveys related work. Section 3 provides background information about the A ⋆ and WA ⋆ search algorithms. Section 4 presents the definition of blocked areas. Section 5 describes the proposed preprocessing algorithm for finding blocked areas in a grid map. Section 6 describes the proposed algorithms for constructing and accessing the BA-tree. Section 7 evaluates the impact of using the blocked areas' knowledge in reducing execution time for both A ⋆ and WA ⋆ . Section 8 concludes the paper.

Related Work
Several heuristic search algorithms in the literature, such as A ⋆ and WA ⋆ , assume no prior knowledge about maps. However, in many grid-based pathfinding domains, maps are static, i.e., their territories remain mostly the same. Examples of such domains are video games and city maps. Therefore, preprocessing approaches, in which prior knowledge about grid maps is utilized to speed up pathfinding, have attracted many researchers because the runtime overhead is amortized (preprocessing needs to be executed only once).
In general, the efficiency of preprocessing approaches for pathfinding depends on the following: (i) the type of knowledge collected; (ii) the runtime overhead of accessing this knowledge during actual pathfinding; and (iii) the memory overhead of storing this knowledge. Below, we survey previously proposed preprocessing methods and describe their difference to our work.
Some researchers proposed preprocessing approaches where optimal paths between all or some nodes are pre-calculated and stored in a database, which is looked up during actual pathfinding [7][8][9][10]. While pathfinding in these approaches is fast and optimal, in general (despite using compression techniques) memory requirements are huge because memory is proportional to the number of nodes in a map. By contrast, the memory requirement in our work is bounded by the number of blocked areas, which is, in practice, a small fraction compared to the total number of nodes in a grid map.
Other researchers suggested using preprocessing to create an abstraction layer (or multiple layers) for each group of nodes in the map with pre-calculated local paths. During actual pathfinding, a path is found first using the abstract map. This path is then refined in a subsequent step for the original map. The returned path is, however, not guaranteed to be optimal. Examples of such works are [11][12][13].
A related approach to the abstraction method is introduced by [14], where all shortest paths between all pairs of nodes are abstracted (instead of abstracting each group of nodes in a grid map). This is done using a preprocessing step that identifies all subgoals in a map. A subgoal is a sequence of nodes such that a path between any pair of nodes inside a subgoal is optimal only if it is a part of the optimal path to the next subgoal. During actual pathfinding, an optimal high-level path between subgoals is found first. This path is then refined to an optimal low-level path.
Compared to abstraction-based approaches, our approach does not require any additional refinement steps when performing pathfinding. In addition, it need not store in memory any knowledge about pre-computed local distances between nodes.
A similar work to ours is the dead-end heuristic [15], which describes areas that are irrelevant to the current search. During the preprocessing phase, the map is decomposed into smaller areas and a high-level abstract graph with nodes representing those small areas is created. During actual pathfinding, the search is split into two phases. The first phase identifies all nodes in the high-level graph that are relevant to the shortest path (the other nodes are called the dead-end areas). The second phase performs actual pathfinding while avoiding dead areas. Due to using an abstract graph, dead-end heuristic requires an extra step during pathfinding; a step that is not needed in our work.
Another similar work to ours is [16], which uses preprocessing to identify swamps, a collection of nodes that can be skipped during optimal pathfinding. Conceptually, swamps and blocked areas have the same definition. In addition, similar to our work, swamps require no additional refinement steps during search time. However, unlike our work, each node in a map is required to store an identifier that tells which swamp this node belongs to so that, during pathfinding, search for nodes inside swamps is blocked. Our approach also blocks search inside the blocked areas, however, without requiring nodes to store blocked areas' identifiers (which would be memory-inefficient in large maps). Instead, this information is retrieved from a memory-efficient binary search tree data structure during pathfinding.
Similar to A ⋆ and WA ⋆ , there are also other heuristic search algorithms in the literature that assume no prior knowledge about maps, such as the Explicit Estimation Search [17] and Jump Point Search [18] algorithms. Our work is complementary to those algorithms, i.e., pre-computed knowledge about blocked areas can be combined with those algorithms to reduce search time. As a proof of concept, this paper evaluates the impact of blocked areas' knowledge on reducing search time for the A ⋆ and WA ⋆ algorithms.
The Grid-Based Path Planning Competition (GPPC) [19] was introduced in 2012 to facilitate comparing different search algorithms using a standard set of maps, some of which were created artificially and others were taken from commercial video games [6]. We use sixty available maps of mazes and rooms from the same set to evaluate the proposed approach in this work. Details of the competing algorithms in the GPPC are summarized by [20].
A notable field related to our work is route planning in transportation networks, in which preprocessing techniques have been proposed to find the shortest paths in road networks [21]. Some of these techniques have also been used in the context of grid-based pathfinding. For example, in the GPPC'15 contest, a route planning approach based on the contraction hierarchies (CH) algorithm [22] achieved competitive performance. During a preprocessing phase, the CH algorithm uses a node contraction method to augment the shortest paths between each pair of nodes in a graph with shortcuts. During actual pathfinding, the search makes use of these shortcuts to reduce the execution time.

Background
In this paper, a grid map is represented as a 2D array of points (or nodes), where each node is identified by x and y coordinates. In addition, each node has a flag that indicates whether it is an obstacle or a non-obstacle node. For a given node n, adjacent(n) is the set of non-obstacle nodes that are reachable from n via a single movement, which can be vertical, horizontal or diagonal. For a node m ∈ adjacent(n), the movement cost from n to m is represented by the function c(n,m):

$$c(n,m) = \begin{cases} 1.0, & \text{if the movement from } n \text{ to } m \text{ is vertical or horizontal} \\ \sqrt{2.0}, & \text{if the movement from } n \text{ to } m \text{ is diagonal} \end{cases}$$

Algorithm 1 shows the pseudo code for the classical A ⋆ algorithm, which finds the shortest path between a source node s and a goal node g in a grid map G. A ⋆ is a best-first search algorithm that gradually expands nodes along the way from s to g while prioritizing nodes with better heuristic scores. When expanding a node q, the algorithm defines the following four values: gscore(q), which is the distance from the source node s to q; hscore(q), which is the algorithm's "guess" of the distance from q to the goal node g; fscore(q) = gscore(q) + hscore(q), which represents the priority of q in the search; and parent(q), which is a pointer to the parent of q in the path from s to g.

Algorithm 1
...
            add q into open with gscore(q) = gscore′, fscore(q) = gscore(q) + hscore(q), parent(q) = n
        else if gscore′ < gscore(q) then
            update q in open with gscore(q) = gscore′, fscore(q) = gscore(q) + hscore(q), parent(q) = n
        end if
    end for
end while
return failure

The algorithm maintains two lists: open and closed. When a node is expanded, it is put in the open list. Nodes in the open list are then iteratively explored in increasing order of fscore. A node that has been explored is removed from the open list and put in the closed list so that it is not explored again. When the goal node is found, the search terminates and the path is constructed by following the parent pointers from the goal node back to the source node.
The optimality of A ⋆ is only guaranteed if the heuristic function is admissible, i.e., the distance estimated by hscore(q) is always less than or equal to the actual distance from q to the goal node g [2]. One simple way of never overestimating hscore(q) is to always assume an unobstructed straight-line movement from q to g. An example of such a heuristic function is the octile-distance function, which is popularly used in 8-connected grid maps where horizontal, vertical and diagonal movements are allowed. Specifically, the octile-distance from node q to a goal node g is equal to the cost of the minimum number of diagonal steps, plus the cost of the minimum number of either vertical or horizontal steps needed to go from q to g. The octile-distance is given by the equation

$$hscore(q) = \sqrt{2} \times minimum(\Delta x, \Delta y) + |\Delta x - \Delta y|$$

where ∆x is the number of horizontal steps and ∆y is the number of vertical steps from q to g. Note that minimum(∆x,∆y) represents the number of diagonal steps from q to g.
WA ⋆ has the same pseudo code as Algorithm 1 with only one modification: it calculates fscore(q) = gscore(q) + ε × hscore(q), where ε > 1. As previously explained, this simple yet elegant modification makes the search greedier, which leads it to find a path more quickly, albeit a sub-optimal one. It is proven that the cost of the sub-optimal path found by WA ⋆ is bounded by ε × the cost of the optimal path [2].
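The search described above can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the paper's implementation: the grid is a list of lists with 0 for free nodes and 1 for obstacles, diagonal moves may cut past obstacle corners, and the `skip` parameter is a hypothetical hook that stands in for a blocked-area lookup (such as the BA-tree query). Setting `weight` to 1 gives A ⋆ and a value above 1 gives WA ⋆ .

```python
import heapq
import math

SQRT2 = math.sqrt(2.0)


def octile(q, g):
    """Octile distance between nodes q and g on an 8-connected grid."""
    dx, dy = abs(q[0] - g[0]), abs(q[1] - g[1])
    return SQRT2 * min(dx, dy) + abs(dx - dy)


def search(grid, s, g, weight=1.0, skip=None):
    """A* (weight == 1) or WA* (weight > 1) on a grid of 0 (free) / 1 (obstacle).

    `skip` is a hypothetical predicate: nodes for which it returns True are
    never added to the open list (e.g. nodes inside a blocked area).
    Returns (path cost, number of expanded nodes); cost is None on failure.
    """
    rows, cols = len(grid), len(grid[0])
    gscore = {s: 0.0}
    open_heap = [(weight * octile(s, g), s)]
    closed = set()
    while open_heap:
        _, n = heapq.heappop(open_heap)
        if n in closed:
            continue  # stale heap entry left behind by a later, cheaper update
        closed.add(n)
        if n == g:
            return gscore[n], len(closed)
        for dx in (-1, 0, 1):
            for dy in (-1, 0, 1):
                q = (n[0] + dx, n[1] + dy)
                if q == n or not (0 <= q[0] < rows and 0 <= q[1] < cols):
                    continue
                if grid[q[0]][q[1]] == 1 or q in closed:
                    continue
                if skip is not None and skip(q):
                    continue
                step = SQRT2 if dx != 0 and dy != 0 else 1.0  # c(n, q)
                cand = gscore[n] + step
                if cand < gscore.get(q, math.inf):
                    gscore[q] = cand
                    fscore = cand + weight * octile(q, g)
                    heapq.heappush(open_heap, (fscore, q))
    return None, len(closed)


grid = [[0, 0, 0],
        [1, 1, 0],
        [0, 0, 0]]
cost, expanded = search(grid, (0, 0), (2, 0))
```

Pushing a fresh heap entry instead of updating an existing one is a common way to emulate the "update q in open" step with `heapq`, which has no decrease-key operation.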
The speed at which the A ⋆ algorithm, as well as the WA ⋆ algorithm, finds a path is affected by the quality of its heuristic function. For example, if the heuristic function computes an inaccurate estimate of the distance to go to the goal node, then the search algorithm will waste time exploring uninteresting nodes, i.e., nodes that do not pertain to the shortest path. As previously mentioned, in many grid maps, a heuristic function may compute inaccurate distance estimates because it is unaware of the blocked ends created by the geometry of obstacles. To this end, this paper aims to identify areas in a grid map with blocked ends and make use of this knowledge to enable search algorithms to avoid wasting time exploring nodes in such areas.

Blocked Area Definition
For a given undirected 2D grid map, a blocked area (BA) is a connected subgraph of adjacent non-obstacle nodes that is bound by a continuous but non-enclosing chain of obstacle nodes. A blocked area's entrance is the imaginary straight line that connects the two end points of the obstacles' chain. Intuitively, any path that connects a node inside a blocked area with a node outside the blocked area passes by its entrance.
Given a blocked area A, we define entrance(A) to be the set of all nodes that lie on the imaginary straight line of A's entrance. Because nodes in a grid map are discrete, the imaginary line may not pass through the nodes themselves and may instead cross the squares that are formed by the nodes. Therefore, a more precise definition of entrance(A) is the set of all nodes that lie on the corners of the squares intersected by the entrance's imaginary straight line. The remaining nodes inside the blocked area are defined as internal(A). These two sets are mutually exclusive, i.e., if node n ∈ internal(A), then n ∉ entrance(A), and vice versa. We also define external(A) to be the set of all nodes n such that n ∉ internal(A) and n ∉ entrance(A).
Given the aforementioned definition of a blocked area, all nodes in internal(A) and entrance(A) must be non-obstacle nodes. Otherwise, A is not a blocked area. Furthermore, because there are no obstacles inside or along the entrance of a blocked area A, the following two properties hold:
Property 1. There is always a path between a node n 1 ∈ entrance(A) and a node n 2 ∈ internal(A).
Property 2. There is always a shortest path between two nodes n 1 and n 2 ∈ entrance(A) such that this path does not pass by any node n ∈ internal(A).
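The main claim of this paper is that blocked areas can be ignored during pathfinding without affecting the correctness and the optimality of a heuristic search algorithm. Below, Lemma 1 proves the correctness claim by showing that there is always an alternative path to any path that passes by a blocked area in a grid map. Lemma 2 proves the optimality claim by showing that there is an alternative path that is also optimal.
Lemma 1. Consider a blocked area A and a pair of nodes s and g that are outside A. If there is a path between s and g that passes by an internal node inside A, then there is also another path between s and g that does not pass by an internal node inside A.
Proof. Let path(s,g) denote a path between s and g, where s, g ∈ external(A). Additionally, let us assume this path passes by a node n ∈ internal(A). In order to prove Lemma 1, we need to show that there is another path, i.e., $\widehat{path}(s,g)$, such that n ∉ $\widehat{path}(s,g)$. Because s and g are outside A, any path that reaches an internal node of A must pass by the entrance of A on its way in and again on its way out. Let x ∈ entrance(A) be the last entrance node visited before path(s,g) first enters internal(A), and let y ∈ entrance(A) be the first entrance node visited after it last leaves internal(A), so that path(s,g) = path(s,x) + path(x,n) + path(n,y) + path(y,g). By Property 2, there is a path(x,y) that does not pass by any internal node of A. Hence, $\widehat{path}(s,g)$ = path(s,x) + path(x,y) + path(y,g) connects s and g without passing by n or any other internal node of A.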

Lemma 2.
Consider a blocked area A and a pair of nodes s and g that are outside A. If there is an optimal path between s and g that passes by an internal node inside A, then there is also another optimal path between s and g that does not pass by an internal node inside A.
Proof. Let path(s,g) denote a path between s and g (s, g ∈ external(A)) such that it passes by a node n ∈ internal(A). Additionally, let path(s,g) be an optimal path with cost c. From Lemma 1, there is at least one path between s and g, denoted $\widehat{path}(s,g)$, such that n ∉ $\widehat{path}(s,g)$. Let ĉ be the cost of $\widehat{path}(s,g)$. In order to prove Lemma 2, we need to show that ĉ = c.
As we showed earlier, we can write path(s,g) = path(s,x) + path(x,n) + path(n,y) + path(y,g) and $\widehat{path}(s,g)$ = path(s,x) + path(x,y) + path(y,g), where nodes x and y ∈ entrance(A). Let c 1, c 2 and c 3 be the cost of path(x,n), path(n,y) and path(x,y), respectively. Thus,

$$c - \hat{c} = c_1 + c_2 - c_3$$

First, because c is optimal, c ≤ ĉ. Thus, c 1 + c 2 − c 3 ≤ 0. Second, due to Property 2, path(x,y) is optimal, i.e., its length is less than or equal to that of path(x,n) + path(n,y). Therefore, c 3 ≤ c 1 + c 2, which leads to

$$c_1 + c_2 - c_3 = 0, \quad \text{i.e.,} \quad \hat{c} = c$$

Blocked Area Detection
In this paper, we consider grid maps where obstacle nodes are adjacent to each other such that they form vertical or horizontal line-segments. In such grid maps, blocked areas have polygon shapes with internal non-obstacle nodes and a non-enclosing perimeter of obstacle nodes. The open side of the blocked area's perimeter represents its entrance.
In a grid map, each line-segment obstacle is represented by two distinct nodes, i.e., e 1 = (x 1 ,y 1 ) and e 2 = (x 2 ,y 2 ), which are its end points. A horizontal line-segment obstacle has the same x-coordinate in both of its end points. A vertical line-segment obstacle has the same y-coordinate in both of its end points. We use the notation e 1 → e 2 to refer to a line-segment obstacle.
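The coordinate convention above (a horizontal obstacle keeps its x-coordinate fixed, a vertical one keeps its y-coordinate fixed) is easy to get backwards, so a small sketch may help. The `Segment` type and its method names are hypothetical, introduced only for illustration; the two sample segments are the ones quoted from Figure 3 later in this section.

```python
from typing import NamedTuple, Tuple

Point = Tuple[int, int]  # (x, y) in the paper's convention


class Segment(NamedTuple):
    """A line-segment obstacle e1 -> e2 given by its two distinct end points."""
    e1: Point
    e2: Point

    def is_horizontal(self) -> bool:
        # Same x-coordinate in both end points.
        return self.e1[0] == self.e2[0]

    def is_vertical(self) -> bool:
        # Same y-coordinate in both end points.
        return self.e1[1] == self.e2[1]


horizontal = Segment((100, 34), (100, 100))
vertical = Segment((67, 67), (133, 67))
```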
A vertical and a horizontal line-segment obstacle may intersect in a grid map. To identify such a scenario, we introduce a data structure called corner, which is represented by three distinct nodes v, t and h, where t is the intersection point, v is the end point of the vertical side and h is the end point of the horizontal side. We use the notation v → t → h to refer to a corner. Note that when a horizontal and a vertical line-segment obstacle intersect, up to four corners with different geometrical shapes can be generated. For example, a cross-shaped intersection has a single intersection point and four corners with the following shapes: ⌜, ⌞, ⌝ and ⌟.
Figure 3 shows an example of a maze with 11 horizontal and 13 vertical line-segment obstacles. Those obstacles intersect in 24 different intersection points such that 43 corners are generated. For example, the vertical line-segment (67,67)→(133,67) intersects with the horizontal line-segment (100,34)→(100,100). As a result, the four corners C 9, C 10, C 11 and C 12 are generated, where C 9 is represented by (67,67)→(100,67)→(100,34), C 10 is represented by (67,67)→(100,67)→(100,100), C 11 is represented by (133,67)→(100,67)→(100,34) and C 12 is represented by (133,67)→(100,67)→(100,100).
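Corner extraction from a single intersection can be sketched as follows. The function name and the tuple layout `(v, t, h)` are assumptions made for this illustration; the example reproduces the four corners C 9 to C 12 quoted above. An end point that coincides with the intersection point contributes no side, which is why T-shaped and L-shaped intersections yield two corners and one corner, respectively.

```python
def corners_at_intersection(vseg, hseg):
    """Return the corners (v, t, h) formed by a vertical and a horizontal obstacle.

    Paper convention: a vertical obstacle has a fixed y-coordinate and a
    horizontal obstacle has a fixed x-coordinate. Returns [] if they do not meet.
    """
    (vx1, vy), (vx2, _) = vseg
    (hx, hy1), (_, hy2) = hseg
    lo_x, hi_x = sorted((vx1, vx2))
    lo_y, hi_y = sorted((hy1, hy2))
    if not (lo_x <= hx <= hi_x and lo_y <= vy <= hi_y):
        return []
    t = (hx, vy)  # intersection point
    # Keep only end points that form a real side, i.e. differ from t.
    v_ends = [p for p in ((lo_x, vy), (hi_x, vy)) if p != t]
    h_ends = [p for p in ((hx, lo_y), (hx, hi_y)) if p != t]
    return [(v, t, h) for v in v_ends for h in h_ends]


corners = corners_at_intersection(((67, 67), (133, 67)), ((100, 34), (100, 100)))
```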
In a grid map, a polygon-shaped blocked area is formed when one or more corners are connected together such that a subset of non-obstacle nodes are bound within a continuous (but non-enclosing) chain of vertical and horizontal line-segment obstacles. For example, in Figure 3, blocked area A 7 is bound by the continuously connected corners (written in counterclockwise order): C 5 , C 6 , C 19 and C 18 . Similarly, blocked area A 15 is bound by the continuously connected corners: C 29 , C 31 , C 39 , C 38 and C 35 . An interesting case is A 1 , which is a triangular-shaped blocked area that was formed by the single corner C 7 .
A polygon-shaped blocked area is represented by its perimeter joint points, which can be extracted from its corners. For example, consider blocked area A 17 in Figure 3, which is formed by the corners C 25, C 43 and C 42 (sorted in counterclockwise order). This blocked area is represented by five joint points: the free end point of C 25, the intersection points of C 25, C 43 and C 42, and the free end point of C 42. In general, a polygon-shaped blocked area with s corners (sorted in counterclockwise order) C 1, C 2, . . . , C s can be represented by s + 2 joint points: e 1, t 1, t 2, . . . , t s, e s, where t i is the intersection point of corner C i, and e 1 and e s are the free points of the two corners C 1 and C s, respectively. The entrance is represented by the straight line between e 1 and e s.
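One possible way to extract the s + 2 joint points is sketched below. The rule used to pick the free point is an assumption of this sketch rather than something the text specifies: the side of C 1 that runs towards t 2 is taken to be the shared side, so the other side's end point is the free one (and likewise for C s and its neighbour). Corners are `(v, t, h)` tuples that are assumed to be already sorted counterclockwise, and the coordinates in the example are made up.

```python
def extract_joints(corners):
    """Return [e1, t1, ..., ts, es] for corners (v, t, h) sorted counterclockwise."""
    def free_point(corner, neighbour_t):
        v, t, h = corner
        # Paper convention: points on the same vertical obstacle share y.
        # If the neighbour's intersection point shares y with t, the vertical
        # side is the shared one, so the horizontal end point is free.
        return h if t[1] == neighbour_t[1] else v

    if len(corners) == 1:
        v, t, h = corners[0]
        return [v, t, h]  # a single corner: both end points are free
    ts = [t for _, t, _ in corners]
    e1 = free_point(corners[0], ts[1])
    es = free_point(corners[-1], ts[-2])
    return [e1, *ts, es]


# A hypothetical U-shaped area: two corners joined by a horizontal obstacle.
u_shape = [((0, 0), (10, 0), (10, 10)), ((0, 10), (10, 10), (10, 0))]
joints = extract_joints(u_shape)
```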
A simple and key observation is the following: a corner can belong to only one blocked area. In other words, different blocked areas in a grid map have disjoint sets of connected corners. Therefore, to identify blocked areas in a grid map, we propose the following approach. First, identify all corners that result from the intersections between vertical and horizontal line-segment obstacles in a grid map. Then, identify each disjoint subset of corners that belong to the same blocked area. Finally, extract joint points from these disjoint sets to represent each blocked area.
Algorithm 2 presents the pseudo code for a preprocessing algorithm that performs the aforementioned approach. Firstly, in lines 1-2, Algorithm 2 uses the sweep line algorithm [23] to identify all intersection points between vertical and horizontal line-segment obstacles. The sweep line algorithm is a widely-used method for finding line-line intersections in Euclidean spaces due to its linearithmic performance [24]. Subsequently, Algorithm 2 extracts all corners from all intersections. Extracting corners from an intersection is straightforward (for brevity, the pseudo code of straightforward functions is not shown).

Algorithm 2 Blocked Area Detection Algorithm
Output: List of all blocked areas in G
...
5:      cornersublist ⇐ subset of corners with intersection points in L
6:      connectedPairs ⇐ PartitionAndSort(cornersublist, L)
7:      for each pair of corners C i and C j in connectedPairs do
8:          UF.union(C i, C j)
9:      end for
10: end for
11: BAlist ⇐ an initially empty list of blocked areas
12: for each disjoint set S in UF do
13:     C 1, C 2, . . . , C s ⇐ get corners in S
14:     sort C 1, C 2, . . . , C s counterclockwise
15:     e 1, t 1, t 2, . . . , t s, e s ⇐ extractJoints(C 1, C 2, . . . , C s)
16:     A ⇐ createBlockedArea(e 1, t 1, t 2, . . . , t s, e s)
17:     add A to BAlist
18: end for
19: for each blocked area A in BAlist do
20:     if entrance(A) has obstacle nodes or internal(A) has obstacle nodes then
21:         remove A from BAlist
22:     end if
23: end for
24: return BAlist

Secondly, in lines 3-10, Algorithm 2 identifies which corners are connected. To do so, we propose the Partition and Sort method, which identifies all pairs of connected corners for every horizontal and vertical line-segment obstacle in a grid map. To explain, consider the horizontal line-segment obstacle (100,133)→(100,232), where the intersection points of the corners C 23, C 30, C 31, C 32, C 33, C 39 and C 40 are located. Our method first partitions the corners into two sublists: upward corners and downward corners. Upward corners are the corners whose vertical side is located upward from the horizontal line-segment obstacle, which include corners C 30, C 31 and C 39. Downward corners are the corners whose vertical side is located downward from the horizontal line-segment obstacle, which include C 23, C 32, C 33 and C 40. Next, our method sorts the corners in each sublist according to the y-coordinates of their vertical sides so that adjacent corners are next to each other. In the sorted order, each adjacent pair of corners with distinct intersection points is connected. For example, sorting the upward corners will give C 30, C 31 and C 39. The first adjacent pair (C 30, C 31) is not connected because both corners have the same intersection point, while the second adjacent pair (C 31, C 39) is connected because the corners have distinct intersection points. Similarly, sorting the downward corners will give C 23, C 32, C 33 and C 40, where only two adjacent pairs are connected: (C 23, C 32) and (C 33, C 40). Algorithm 3 presents the pseudo code for our proposed Partition and Sort method. The Partition and Sort method is applied to vertical line-segment obstacles in the same way, except that it partitions the corners to left and right (instead of up and down) and sorts them by x-coordinates (instead of y-coordinates).
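The horizontal case of the Partition and Sort method can be sketched as below. Corners are `(v, t, h)` tuples, and "upward" is taken to mean a smaller x-coordinate than the obstacle, which is an assumption of this sketch. The text does not say how corners that share an intersection point are ordered relative to each other, so this sketch breaks ties by the y-coordinate of the horizontal end point h; that keeps the two corners that face each other across a gap adjacent in the sorted list. The example coordinates are made up.

```python
def partition_and_sort_horizontal(corners, x_l):
    """Return the connected corner pairs on a horizontal obstacle at x = x_l."""
    # Partition by the side on which the vertical end point v lies.
    upward = [c for c in corners if c[0][0] < x_l]
    downward = [c for c in corners if c[0][0] >= x_l]
    connected_pairs = []
    for side in (upward, downward):
        # Sort by the y-coordinate of the intersection point t, then by the
        # y-coordinate of h (tie-break assumed by this sketch).
        side.sort(key=lambda c: (c[1][1], c[2][1]))
        for a, b in zip(side, side[1:]):
            if a[1] != b[1]:  # distinct intersection points
                connected_pairs.append((a, b))
    return connected_pairs


# Horizontal obstacle (100,0)->(100,30) met from above at y = 10 and y = 20.
a_low = ((50, 10), (100, 10), (100, 0))
a_high = ((50, 10), (100, 10), (100, 30))
b_low = ((50, 20), (100, 20), (100, 0))
b_high = ((50, 20), (100, 20), (100, 30))
pairs = partition_and_sort_horizontal([b_high, a_low, b_low, a_high], 100)
```

Only the pair that encloses the gap between the two vertical obstacles is reported, which mirrors the (C 31, C 39) example in the text.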

Algorithm 3 Partition and Sort method
Input: Set of l corners: C 1, C 2, . . . , C l with intersection points located on line-segment obstacle L.
Output: All pairs of connected corners on L
1: connectedPairs ⇐ an initially empty list
2: if L is horizontal then
3:     upward ⇐ an initially empty list
4:     downward ⇐ an initially empty list
5:     x L ⇐ x-coordinate of L
6:     for each corner C i in C 1, C 2, . . . , C l do
7:         x i ⇐ x-coordinate of the vertical end point of C i
8:         if x i < x L then
9:             add C i to upward
10:         else
11:             add C i to downward
12:         end if
13:     end for
14:     sort corners in upward and downward by the y-coordinate of their intersection points
15:     for each adjacent pair of corners C i and C j in upward and downward do
16:         if C i and C j have distinct intersection points then
17:             add (C i, C j) to connectedPairs
18:         end if
19:     end for
20: else if L is vertical then
21:     left ⇐ an initially empty list
22:     right ⇐ an initially empty list
23:     y L ⇐ y-coordinate of L
24:     for each corner C i in C 1, C 2, . . . , C l do
25:         y i ⇐ y-coordinate of the horizontal end point of C i
26:         if y i < y L then
27:             add C i to left
28:         else
29:             add C i to right
...

The Partition and Sort method identifies connected corners in pairs. However, a blocked area can have multiple connected corners. To find all connected corners, we propose using a union-find data structure, which is a commonly-used data structure for keeping track of connected components in graphs (Sedgewick and Wayne, 2011). A union-find data structure starts by assuming that all corners are disjoint and puts each one into a separate set. Then, every time a pair of corners is found to be connected by the Partition and Sort method, the union-find data structure unions their sets. By doing so for all pairs of connected corners in a grid map, the union-find data structure ends up with all connected corners in the same set. Furthermore, all sets in the union-find data structure are disjoint.
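The grouping step can be illustrated with a small weighted union-find. This is a generic textbook-style sketch, not the paper's code; the class and method names are hypothetical. The example feeds in only the three connected pairs found on the obstacle (100,133)→(100,232) above, so the resulting sets are partial: in the full algorithm, pairs from every obstacle are merged before blocked areas are read off.

```python
class UnionFind:
    """Weighted union-find: the smaller tree is always attached to the larger."""

    def __init__(self, items):
        self.parent = {x: x for x in items}
        self.size = {x: 1 for x in items}

    def find(self, x):
        while self.parent[x] != x:
            x = self.parent[x]
        return x

    def union(self, a, b):
        ra, rb = self.find(a), self.find(b)
        if ra == rb:
            return
        if self.size[ra] < self.size[rb]:
            ra, rb = rb, ra
        self.parent[rb] = ra  # attach the smaller tree under the larger root
        self.size[ra] += self.size[rb]

    def groups(self):
        out = {}
        for x in self.parent:
            out.setdefault(self.find(x), []).append(x)
        return list(out.values())


uf = UnionFind(["C23", "C30", "C31", "C32", "C33", "C39", "C40"])
for ci, cj in [("C31", "C39"), ("C23", "C32"), ("C33", "C40")]:
    uf.union(ci, cj)
sets = uf.groups()
```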
Thirdly, in lines 11-18, Algorithm 2 creates polygon-shaped blocked areas by extracting their perimeter's joint points from every disjoint set of corners in the union-find data structure.
Finally, in lines 19-23, Algorithm 2 removes all blocked areas that do not satisfy our definition in Section 4, which states that all internal and entrance nodes must be non-obstacle nodes. For example, in Figure 3, C 22 and C 34 form a blocked area. However, this blocked area is discarded because its entrance intersects with the vertical line-segment obstacle (34,166)→(133,166), i.e., it has obstacle nodes in its entrance. Another example is the set of corners C 16, C 14, C 1 and C 2, which form a blocked area that is discarded because it has internal obstacle nodes.
The detailed description of how Algorithm 2 determines which blocked areas have obstacles in their entrance or internal sets is not shown but can be explained as follows. Because a blocked area's entrance is a straight line, the sweep line algorithm is used to determine if there is any horizontal or vertical line-segment obstacle that intersects with the entrance of a blocked area A. If so, A is discarded. Additionally, we modify the sweep line algorithm so that it can be used to identify all blocked areas with internal vertical or horizontal line-segment obstacles. Those blocked areas are also discarded.

Lemma 3.
The set of polygon-shaped blocked areas generated by Algorithm 2 satisfies the definition given in Section 4, i.e., each polygon-shaped blocked area is a connected subgraph of adjacent non-obstacle nodes that is bounded by a continuous but non-enclosing chain of obstacle nodes.
Proof. As previously explained, in lines 1-18, Algorithm 2 identifies all non-enclosing polygon shapes of obstacles that represent blocked areas in a map. Then, in lines 19-23, Algorithm 2 determines which non-enclosing polygon shapes have obstacles in their entrance or internal sets and ensures that they are removed, i.e., not recognized as blocked areas. Thus, Algorithm 2 guarantees that each polygon-shaped blocked area in the final output is a connected subgraph of adjacent non-obstacle nodes that is bounded by a continuous but non-enclosing chain of vertical and horizontal line-segment obstacles (i.e., obstacle nodes).
We now present a complexity analysis of Algorithm 2's execution time. For convenience, let us assume that, in a grid map, $T_V$ is the number of vertical line-segment obstacles, $T_H$ is the number of horizontal line-segment obstacles, $R$ is the number of intersections, $N$ is the number of corners and $P$ is the number of blocked areas. The following inequalities hold:
• $R \le T_V \times T_H$. The maximum number of possible intersections occurs when every vertical line-segment obstacle intersects with every horizontal line-segment obstacle.
• $R \le N \le 4 \times R$. Each intersection generates between one and four corners.
• $P \le N$. The maximum number of blocked areas occurs when each corner, alone, forms a triangular-shaped blocked area.
Table 1 describes an execution time complexity analysis for Algorithm 2. The authors of [24] have shown that, for a given Cartesian space with $n$ line-segments and $k$ intersections, the sweep line algorithm is bounded by $n \log_2 n + k$. This explains the upper bounds shown for lines 1 and 19-23 in Table 1. In addition, note that we use the weighted union-find data structure (Sedgewick and Wayne, 2011), which guarantees that union operations are executed in logarithmic time. Therefore, in lines 4-10, the most expensive operation inside the loop is the Partition and Sort method. Specifically, if the number of corners in cornersublist is $N_l$, the Partition and Sort method needs linear time (i.e., $N_l$) to partition the corners and linearithmic time (i.e., $N_l \log_2 N_l$) to sort the corners. Summed over all iterations of the loop, the overall execution of the Partition and Sort method is bounded by $N + N \log_2 N$. Finally, in lines 11-18, the most expensive operation in the loop is the sort operation in line 14. Hence, the overall loop's execution is bounded by $N \log_2 N$.
By taking into account the above three inequalities, the entire execution of Algorithm 2 is bounded by $(T_V + T_H + P) \log_2 (T_V + T_H + P) + N \log_2 N$. This shows that the preprocessing time of Algorithm 2 has an efficient linearithmic growth.
By the end of its execution, Algorithm 2 will identify all polygon-shaped blocked areas in a grid map. Each polygon-shaped blocked area is represented by its perimeter joint points, i.e., each blocked area is stored in memory using pointers that point to the nodes located at these joints. Assuming $J_i$ is the number of joints in blocked area $A_i$, the total number of extra pointers needed to store all blocked areas is $J = \sum_{i=1}^{P} J_i$. In practice, even for large maps, $J$ represents a small fraction compared to the total number of nodes in a map. For example, in all of our benchmark set in Section 7, $J$ is less than 5.4% of the total number of nodes in the map.

Table 1. Complexity analysis of Algorithm 2's execution time.

Lines | Upper Bound Complexity | Explanation
      |                        | Initializing union-find with $N$ corners
4-10  | $N + N \log_2 N$ | The Partition and Sort method applied for a sub-list of corners inside a loop
11-18 | $N \log_2 N$ | Sorting a sub-list of corners inside a loop
19-23 | $(T_V + T_H + P) \log_2 (T_V + T_H + P) + P$ | Sweep Line Algorithm

BA-Tree
As previously mentioned, our approach uses pre-computed knowledge about blocked areas in a map to prevent a search algorithm from unnecessarily exploring nodes inside blocked areas. This is achieved using the BA-tree, a binary tree data structure that stores blocked areas' information. During actual pathfinding, a search algorithm accesses the BA-tree to determine whether a particular node is inside a blocked area. If so, this node is discarded. Below, we first describe a preprocessing algorithm that constructs the BA-tree. Then, we describe an algorithm for accessing the BA-tree.

BA-Tree Construction
In computer science, spatial searching refers to the problem of locating objects in multi-dimensional spaces. R-tree data structures [25] have been extensively used for handling spatial searching in many contexts such as database applications and geographic information system applications [26]. The basic idea of R-tree data structures is to recursively subdivide a multi-dimensional space into subspaces such that nearby objects are grouped together into the same subspace. Those subspaces are then organized into a tree data structure, which in turn is used for servicing search queries. In this paper, we present the BA-tree, a variant of R-tree that is applied in the context of 2D grid-based pathfinding. Specifically, the BA-tree is a balanced binary search tree that is used for identifying which blocked area a given node belongs to in a 2D grid map.
For convenience, we first define the minimum bounding rectangle (MBR), a term borrowed from the literature on R-tree data structures: it is the smallest rectangle that encapsulates a group of nearby blocked areas. Formally, we define $\mathrm{MBR}(A_1, A_2, \ldots, A_m)$ to be the smallest rectangle that bounds a group of $m$ blocked areas $A_1, A_2, \ldots, A_m$. An MBR is represented by four coordinates: $x_{upper}$, the coordinate of the upper-most row; $x_{lower}$, the coordinate of the lower-most row; $y_{left}$, the coordinate of the left-most column; and $y_{right}$, the coordinate of the right-most column. For example, in Figure 3, $\mathrm{MBR}(A_{10}, A_{11}, A_{13}, A_{15})$ is represented by $x_{upper} = 1$, $x_{lower} = 100$, $y_{left} = 133$ and $y_{right} = 232$.
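As a small illustration of this definition, the sketch below computes an MBR from the joint points of a group of blocked areas and tests whether a node falls inside it. It assumes each blocked area is given as a list of (x, y) joint points, with x the row and y the column as in the text; the names `MBR` and `mbr_of` are hypothetical.

```python
from typing import List, NamedTuple, Tuple

Point = Tuple[int, int]  # (x, y) = (row, column), following the text


class MBR(NamedTuple):
    x_upper: int  # upper-most (smallest) row
    x_lower: int  # lower-most (largest) row
    y_left: int   # left-most (smallest) column
    y_right: int  # right-most (largest) column

    def contains(self, q: Point) -> bool:
        # q is inside when both coordinates fall within the rectangle.
        return (self.x_upper <= q[0] <= self.x_lower
                and self.y_left <= q[1] <= self.y_right)


def mbr_of(areas: List[List[Point]]) -> MBR:
    """Smallest axis-aligned rectangle covering every joint point."""
    xs = [x for area in areas for x, _ in area]
    ys = [y for area in areas for _, y in area]
    return MBR(min(xs), max(xs), min(ys), max(ys))
```

Because a polygon is contained in the bounding box of its joint points, taking the minimum and maximum over the joints is sufficient.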
Given a grid map with $m$ blocked areas, the BA-tree is constructed as follows. Initially, an MBR that spans all $m$ blocked areas is created. This MBR is then partitioned vertically into two MBRs such that each group of $m/2$ nearby blocked areas is put in the same MBR. Each MBR is then partitioned horizontally into two MBRs such that each group of $m/4$ nearby blocked areas is put in the same MBR. This is done recursively while alternating between vertical partitioning and horizontal partitioning. The recursive partitioning terminates when one or two blocked areas are reached. The resulting MBRs from the recursive partitioning are organized to form the following binary tree data structure:
• Internal nodes represent MBRs while leaf nodes represent blocked areas.
• At level 0 of the tree, there is only one node, the root node, which represents the MBR that encapsulates all $m$ blocked areas.
• At level 1 of the tree, there are two nodes, which represent the two MBRs generated from applying vertical partitioning to the MBR of the node at level 0.
• At level 2 of the tree, there are four nodes, which represent the four MBRs generated from applying horizontal partitioning to the MBRs of the two nodes at level 1.
• In general, at level $l$ of the tree, there are $2^l$ nodes, which are generated from partitioning the $2^{l-1}$ nodes at level $l-1$ of the tree. This partitioning is vertical if $l$ is odd, while it is horizontal if $l$ is even.
Algorithm 4 presents pseudo code for the vertical and horizontal partitioning functions that construct the BA-tree for a grid map. To simplify partitioning, both functions use sorting (line 11) to ensure that nearby blocked areas are adjacent to each other. Both functions recursively call each other (lines 12-13) to alternate the partitioning process. As an example, Figure 4 shows the BA-tree constructed by Algorithm 4 for the maze in Figure 3. Note that MBRs in different internal nodes may overlap. In Figure 4, rectangles are used for internal nodes to highlight the fact that they represent MBRs. In addition, each rectangle has four numbers, one on each of its sides, showing its $x_{upper}$, $x_{lower}$, $y_{left}$ and $y_{right}$ coordinates. Each leaf node may hold only one or two blocked areas. Below, we show a few key properties of the BA-tree.
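The recursive, alternating partitioning can be sketched as below. This is an illustration under stated assumptions, not a transcription of Algorithm 4: blocked areas are assumed to be lists of (x, y) joint points, the sort key is assumed to be the midpoint of each area's extent along the cut axis, a vertical cut is assumed to separate areas by column, and leaves also carry an MBR for convenience. The names `Node`, `build`, `center` and `height` are hypothetical.

```python
from dataclasses import dataclass
from typing import List, Optional, Tuple

Point = Tuple[int, int]            # (x, y) = (row, column)
Area = List[Point]                 # a blocked area as its joint points
Box = Tuple[int, int, int, int]    # (x_upper, x_lower, y_left, y_right)


@dataclass
class Node:
    mbr: Box
    areas: Optional[List[Area]] = None   # set only on leaf nodes
    left: Optional["Node"] = None
    right: Optional["Node"] = None


def mbr_of(areas: List[Area]) -> Box:
    xs = [x for a in areas for x, _ in a]
    ys = [y for a in areas for _, y in a]
    return (min(xs), max(xs), min(ys), max(ys))


def center(area: Area, axis: int) -> float:
    # Assumed sort key: midpoint of the area's extent along one axis.
    vals = [p[axis] for p in area]
    return (min(vals) + max(vals)) / 2


def build(areas: List[Area], vertical: bool = True) -> Node:
    """Recursively halve the areas, alternating the cut direction."""
    if len(areas) <= 2:                  # stop at one or two blocked areas
        return Node(mbr_of(areas), areas=areas)
    axis = 1 if vertical else 0          # a vertical cut separates columns (y)
    ordered = sorted(areas, key=lambda a: center(a, axis))
    mid = (len(ordered) + 1) // 2
    return Node(mbr_of(ordered),
                left=build(ordered[:mid], not vertical),
                right=build(ordered[mid:], not vertical))


def height(node: Node) -> int:
    """Maximum level of any node, with the root at level 0."""
    if node.areas is not None:
        return 0
    return 1 + max(height(node.left), height(node.right))
```

Splitting the sorted list at its midpoint keeps the two subtrees within one blocked area of each other in size, which is what keeps the tree balanced.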

Lemma 4.
Assuming $m$ is the number of blocked areas in a grid map, the height of the BA-tree is $\lceil \log_2 m \rceil - 1$.
Proof. The height of a binary tree is equal to the maximum level of any of its nodes. Therefore, our goal is to show that all nodes in the BA-tree are located at levels $\le \lceil \log_2 m \rceil - 1$. Let us first consider the simple case in which $m$ is a power of 2, i.e., $m = 2^d$, where $d$ is some integer. In this case, the levels of the BA-tree are $0, 1, \ldots, d-1$, where $d-1$ is the last level because the recursive partitioning terminates when the number of blocked areas is 2 (line 1 in Algorithm 4). Thus, the height of the tree is $d - 1 = \log_2 m - 1$.
In the general case, let $h$ be the height of the BA-tree. Furthermore, let $d$ be an integer such that $2^{d-1} < m \le 2^d$. First, $h \le d-1$ because $m \le 2^d$ and $d-1$ is the height of a tree built from $2^d$ blocked areas (as shown by the simple case). Second, $h > d-2$ because $m > 2^{d-1}$ and $d-2$ is the height of a tree built from $2^{d-1}$ blocked areas. Combining both inequalities, we can write $d-2 < h \le d-1$. Because $h$ is an integer, the only possible solution is $h = d-1$. Because $d = \lceil \log_2 m \rceil$ (obtained by applying the logarithm function to the inequality $2^{d-1} < m \le 2^d$), we have $h = \lceil \log_2 m \rceil - 1$.

Lemma 5.
The BA-tree is balanced.
Proof. Let $m$ be the number of blocked areas in a grid map and $h$ be the height of the BA-tree. To show that the BA-tree is balanced, we need to show that each leaf node in the BA-tree is located at either level $h-1$ or level $h$.
Let us first consider the simple case in which $m$ is a power of 2, i.e., $m = 2^d$, where $d$ is some integer. In this case, it is easy to show that the BA-tree is balanced because Algorithm 4 always partitions the MBR of an internal node $e$ such that the number of blocked areas in the left subtree of $e$ is equal to the number of blocked areas in the right subtree of $e$ (lines 12-13). Furthermore, in this case, all leaf nodes are located exactly at level $d-1$ (because $m$ is a power of 2 and the recursive partitioning terminates when the number of blocked areas is 2).
In the general case, let $d$ be an integer such that $2^{d-1} < m \le 2^d$. Because $m > 2^{d-1}$, no leaf node can be located at a level smaller than $d-2$ (as shown by the simple case). Furthermore, because $m \le 2^d$, the maximum level in the BA-tree is $d-1$ (Lemma 4). Thus, all leaf nodes can only be located at levels $d-2$ and $d-1$. Because $h = d-1$ (Lemma 4), we can also infer that all leaf nodes are located at levels $h-1$ and $h$.

Lemma 6.
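Assuming $m$ is the number of blocked areas in a grid map, the number of nodes in the BA-tree is bounded by $2 \times m - 1$.
Proof. Because the number of nodes in level $i$ is at most $2^i$ and the height of the BA-tree is $\lceil \log_2 m \rceil - 1$ (Lemma 4), the number of nodes in the BA-tree is bounded by $\sum_{i=0}^{\lceil \log_2 m \rceil - 1} 2^i$. This is a geometric series that is equal to $2^{\lceil \log_2 m \rceil} - 1$. Since $\lceil \log_2 m \rceil < \log_2 m + 1$, we have $2^{\lceil \log_2 m \rceil} < 2^{\log_2 m + 1} = 2 \times m$. Thus, $2^{\lceil \log_2 m \rceil} - 1 < 2 \times m - 1$.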

Corollary 1.
Assuming $m$ is the number of blocked areas in a grid map, the number of internal nodes in the BA-tree is bounded by $m - 1$ and the number of leaf nodes is bounded by $m$.

BA-Tree Analysis
We now present a complexity analysis of the execution time of Algorithm 4. For simplicity, let us assume that the number of blocked areas $m$ is a power of 2, so that the height of the BA-tree is $\log_2 m - 1$ (Lemma 4). In Algorithm 4, the sort operation in line 11 is the most expensive operation, i.e., the execution time is bounded by the sorting time (which has an upper bound complexity of $n \log_2 n$, where $n$ is the number of integers). At each level $j$ in the BA-tree, there are $2^j$ nodes, i.e., subproblems. Each subproblem's execution time is bounded by sorting $m/2^j$ blocked areas. Therefore, the amount of work done by all nodes in level $j$ is $2^j \times \frac{m}{2^j} \log_2 \frac{m}{2^j} = m \log_2 \frac{m}{2^j} \le m \log_2 m$. Because the height of the tree is $\log_2 m - 1$ (Lemma 4), i.e., the tree has $\log_2 m$ levels, the amount of work done to construct all levels in the BA-tree is bounded by $m \log_2^2 m$. In general, the execution time is bounded by $m \lceil \log_2 m \rceil^2$.
We now present a memory consumption analysis for the BA-tree. As shown by Corollary 1, the number of internal nodes is bounded by $m - 1$, where $m$ is the number of blocked areas. Each internal node stores four integers (the coordinates of its MBR) and two pointers (left and right). Assuming pointers and integers require the same amount of memory, all internal nodes need to store no more than $6 \times (m - 1)$ integers. In other words, the memory needed for internal nodes in the BA-tree is bounded by a constant multiple of the number of blocked areas in a grid map. On the other hand, leaf nodes store the blocked areas $A_1, A_2, \ldots, A_m$. As previously mentioned in Section 5, the amount of memory needed to store the blocked areas' information, denoted by $J$, is bounded by the total number of joint points in blocked areas, which is, in practice, a small fraction of the number of nodes in a grid map.
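The per-level bound can also be summed exactly. The short derivation below is a sketch that assumes $m = 2^L$ and charges every level $j = 0, \ldots, L-1$ the full sorting cost, including the leaf level, so it is an upper bound rather than an exact operation count; the symbol $L$ is introduced here only for brevity.

```latex
% Assumes m = 2^L, so the BA-tree has levels j = 0, ..., L-1 (Lemma 4).
\begin{align*}
\sum_{j=0}^{L-1} m \log_2 \frac{m}{2^j}
  &= m \sum_{j=0}^{L-1} (L - j)
   = m \, \frac{L(L+1)}{2} \\
  &\le m L^2 = m \log_2^2 m \qquad (L \ge 1).
\end{align*}
```

The exact sum is roughly half of the stated bound, which does not change the asymptotic growth.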

BA-Tree Access
Algorithm 5 presents pseudo code for a recursive search function that identifies which blocked area $A$ in the BA-tree contains a particular node $q$ in a grid map. Starting from the root node, the search function traverses the BA-tree in a depth-first fashion, with one key optimization: when visiting an internal node $e$, if $q \notin \mathrm{MBR}(e)$, the search of the descendant nodes of $e$ is skipped (lines 9-11). This is because, if $q \notin \mathrm{MBR}(e)$, then $q$ cannot be inside any of the descendant blocked areas of $e$. In the case $q \in \mathrm{MBR}(e)$, the function continues the search in the left path of $e$ (line 12). If a blocked area is not found in the left path (line 13), the function then searches the right path of $e$ (line 14). Determining whether $q \in \mathrm{MBR}(e)$ is straightforward: $q \in \mathrm{MBR}(e)$ if and only if $x_{upper} \le q.x \le x_{lower}$ and $y_{left} \le q.y \le y_{right}$. In the case that $e$ is a leaf node (lines 1-8), the search function simply checks whether $q$ is contained in any of the blocked areas in $e$ and terminates the search if such a blocked area is found. Note that the maximum number of blocked areas in a leaf node is two. To determine whether a polygon-shaped blocked area $A$ contains a node $q$, we use the winding number algorithm [27], a widely used method in computational geometry for determining whether a point is inside a polygon. Figure 5 shows examples of two search queries for the BA-tree in Figure 4.
The worst-case scenario in Algorithm 5 occurs when the search function visits all the nodes in the BA-tree, the number of which is bounded by $2 \times m - 1$ (Lemma 6), where $m$ is the number of blocked areas. However, this is rarely needed because, in real maps, blocked areas are often scattered such that they are isolated from each other. Thus, in the common case, the search function only needs to check a subset of paths in the BA-tree. Because the height of the BA-tree is $\lceil \log_2 m \rceil - 1$ (Lemma 4), the average-case execution time of Algorithm 5 is $\tau \times \lceil \log_2 m \rceil$, where $\tau$ is some constant.
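The pruned depth-first search and the point-in-polygon test can be sketched as follows. This is an illustrative version rather than Algorithm 5 itself: it assumes blocked areas are stored as lists of (x, y) joint points, that leaves also carry an MBR so the same pruning test applies to them, and it uses a common crossing-based formulation of the winding number test. The names `Node`, `in_mbr`, `winding_number` and `search` are hypothetical, and nodes lying exactly on a polygon's boundary may be classified either way.

```python
from dataclasses import dataclass
from typing import List, Optional, Tuple

Point = Tuple[int, int]            # (x, y) = (row, column)
Box = Tuple[int, int, int, int]    # (x_upper, x_lower, y_left, y_right)


@dataclass
class Node:
    mbr: Box
    areas: Optional[List[List[Point]]] = None  # polygons; set on leaves only
    left: Optional["Node"] = None
    right: Optional["Node"] = None


def in_mbr(q: Point, b: Box) -> bool:
    return b[0] <= q[0] <= b[1] and b[2] <= q[1] <= b[3]


def _side(a: Point, b: Point, q: Point) -> int:
    # Sign tells on which side of the directed line a->b the point q lies.
    return (b[0] - a[0]) * (q[1] - a[1]) - (q[0] - a[0]) * (b[1] - a[1])


def winding_number(q: Point, poly: List[Point]) -> int:
    """Winding number of poly around q; non-zero means q is inside."""
    wn = 0
    for a, b in zip(poly, poly[1:] + poly[:1]):
        if a[1] <= q[1]:
            if b[1] > q[1] and _side(a, b, q) > 0:
                wn += 1          # edge crosses q's line in one direction
        elif b[1] <= q[1] and _side(a, b, q) < 0:
            wn -= 1              # edge crosses q's line in the other direction
    return wn


def search(q: Point, node: Optional[Node]) -> Optional[List[Point]]:
    """Return the blocked area (polygon) containing q, or None."""
    if node is None or not in_mbr(q, node.mbr):
        return None                      # prune this whole subtree
    if node.areas is not None:           # leaf: at most two polygons to test
        for poly in node.areas:
            if winding_number(q, poly) != 0:
                return poly
        return None
    found = search(q, node.left)         # left path first, then right path
    return found if found is not None else search(q, node.right)
```

A query that falls outside the root's MBR returns after a single rectangle test, which is the common case for nodes far from any blocked area.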

BA-Tree Alternatives
We now describe two alternative schemes for storing blocked areas' knowledge and discuss how they differ from the BA-tree. The first alternative scheme is to assign a unique numeric ID to each blocked area, and then use an extra grid in which each node stores the ID of the corresponding blocked area, or an invalid ID if it is outside all blocked areas. Such a scheme would require O(1) access time and O(n) space, where n is the total number of nodes in a grid map. By contrast, the BA-tree is more memory-efficient because it stores information about only joint points, which represent a small fraction of the total number of nodes in a grid map. Furthermore, the BA-tree has a low access time overhead, as discussed earlier.
Another scheme is to use trapezoidal decomposition [28], i.e., divide blocked areas into a set of trapezoids such that each trapezoid is the portion of the sweep line between two adjacent corners. Trapezoids are neighbors if they are neighbors along the sweep line or if one appears when the other disappears at a sweep event. The number of trapezoids is bounded by 3 × n, where n is the total number of line-segment obstacles in a grid map [28]. Identifying which trapezoid contains a particular node in a grid map is O(1). More specifically, the first node requires O(n) time because a linear search is performed. Afterward, identifying adjacent nodes requires constant time because they are found in adjacent trapezoids. The trapezoidal decomposition scheme provides better access time guarantees than the BA-tree. However, it requires more memory to store the trapezoids' information.
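The first alternative can be sketched in a few lines. The example assumes the nodes covered by each blocked area are already known; `NO_AREA` and `build_id_grid` are hypothetical names used only for illustration.

```python
from typing import Dict, Iterable, List, Tuple

NO_AREA = -1  # hypothetical sentinel for "outside all blocked areas"


def build_id_grid(rows: int, cols: int,
                  area_nodes: Dict[int, Iterable[Tuple[int, int]]]
                  ) -> List[List[int]]:
    """Extra grid storing, for every node, the ID of its blocked area."""
    grid = [[NO_AREA] * cols for _ in range(rows)]
    for area_id, nodes in area_nodes.items():
        for x, y in nodes:
            grid[x][y] = area_id
    return grid
```

A lookup is a single array access, but the extra grid holds one entry per map node regardless of how few blocked areas exist, which is the O(n) space cost noted above.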

Experimental Results
We evaluate performance by measuring the reduction in execution time for both the A⋆ and WA⋆ algorithms after incorporating the knowledge of blocked areas into their search. We perform pathfinding on sixty maps of mazes and rooms taken from the public pathfinding benchmarks library [6]. The modified algorithms are denoted by A⋆ + BA and WA⋆ + BA. All four search algorithms (A⋆, WA⋆, A⋆ + BA and WA⋆ + BA) were implemented and compiled using Java SE 8, and all experiments were executed on a Red Hat Enterprise Linux 6 machine with a 2.2 GHz Intel Xeon-E5 processor and 64 GB of DDR3 memory running at 1333 MHz.
Below, we describe, in detail, the implementation of the modified search algorithms and the tested benchmark set. Then, we present the evaluation results.

Search Implementation
Algorithm 6 shows the pseudo code for the implementation of the A⋆ + BA algorithm. Compared to the standard implementation of A⋆ (shown in Algorithm 1), Algorithm 6 has two modifications. First, before adding node q into the open list, the algorithm accesses the BA-tree (in line 14) using the search function presented in Algorithm 5 to determine the blocked area A where q is located. Second, in lines 15-17, the algorithm inserts q into the open list (i.e., includes it in the search) only if one of the following three conditions is satisfied:

1. q is not inside a blocked area.
2. q is inside a blocked area; however, this blocked area also contains the goal node g. This condition ensures that the search never ignores a blocked area where the goal node is located.
3. q is inside a blocked area; however, this blocked area also contains the parent node of q. This condition is needed in the case that the source node s happens to be inside a blocked area. If so, this blocked area is included in the search.
The aforementioned conditions are checked in the order shown, i.e., condition 2 is only checked if condition 1 is not satisfied and condition 3 is only checked if both conditions 1 and 2 are not satisfied. If none of the three conditions is satisfied, then q is not inserted into the open list and is therefore discarded from the search space. WA⋆ and WA⋆ + BA have the same implementations as A⋆ and A⋆ + BA, respectively, except that they calculate fscore(q) as gscore(q) plus a weighted hscore(q), where the weight is a real number greater than 1 that controls the inflation of the heuristic function hscore(q). In this paper, we set the weight to 3.0.
 7:     if n = g then
 8:         return path from s to g
 9:     end if
10:     add n into closed
11:     for each node q ∈ adjacent(n) and q ∉ closed do
12:         gscore′ ⇐ gscore(n) + c(q, n)
13:         if q ∉ open then
14:             A ⇐ searchBAtree(q, BAroot)
15:             if A = null or g ∈ A or A = searchBAtree(parent(q), BAroot) then
16:                 add q into open with gscore(q) = gscore′, fscore(q) = gscore(q) + hscore(q), parent(q) = n
17:             end if
18:         else if gscore′ < gscore(q) then
19:             update q in open with gscore(q) = gscore′, fscore(q) = gscore(q) + hscore(q), parent(q) = n
20:         end if
21:     end for
22: end while
23: return failure
An important note is that a tie may occur when extracting the node with the minimum f-score from the open list (line 6), i.e., there might be multiple nodes that have the same minimum f-score. In such a case, ties are broken in favor of the node with the largest g-score. This tie-breaking strategy is common in the literature [29,30]. We use this strategy in the implementation of all four tested algorithms.
In all four algorithms, the open list is implemented using a binary min heap data structure, while the closed list is implemented using a hash table data structure that uses chaining to resolve collisions.
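The modified search can be sketched as follows. This is a simplified illustration, not the evaluated Java implementation: it assumes a 4-connected grid with unit move costs and a Manhattan heuristic, replaces the BA-tree lookup with a caller-supplied function `area_of` (a hypothetical stand-in for searchBAtree), and uses lazy deletion of stale heap entries instead of an in-place update of the open list. The parameter `w` is a hypothetical name for the heuristic weight (1.0 gives A⋆-like behavior).

```python
import heapq
from typing import Callable, Dict, List, Optional, Set, Tuple

Node = Tuple[int, int]  # (x, y) = (row, column)


def astar_ba(grid: List[str], s: Node, g: Node,
             area_of: Callable[[Node], Optional[int]],
             w: float = 1.0) -> Tuple[Optional[List[Node]], int]:
    """Best-first search that skips nodes inside irrelevant blocked areas.

    grid: rows of '.' (free) and '#' (obstacle); moves are 4-connected, cost 1.
    area_of: stand-in for the BA-tree lookup; returns an area id or None.
    Returns (path or None, number of expanded nodes).
    """
    def h(n: Node) -> int:
        return abs(n[0] - g[0]) + abs(n[1] - g[1])  # Manhattan distance

    goal_area = area_of(g)
    gscore: Dict[Node, int] = {s: 0}
    parent: Dict[Node, Node] = {}
    closed: Set[Node] = set()
    # Heap entries are (f, -g, node); -g breaks f-ties in favor of larger g.
    open_heap = [(w * h(s), 0, s)]
    while open_heap:
        _, neg_g, n = heapq.heappop(open_heap)
        if n in closed or -neg_g != gscore[n]:
            continue  # stale entry superseded by a cheaper push
        if n == g:
            path = [n]
            while path[-1] != s:
                path.append(parent[path[-1]])
            return path[::-1], len(closed)
        closed.add(n)
        for dx, dy in ((1, 0), (-1, 0), (0, 1), (0, -1)):
            q = (n[0] + dx, n[1] + dy)
            if not (0 <= q[0] < len(grid) and 0 <= q[1] < len(grid[0])):
                continue
            if grid[q[0]][q[1]] == '#' or q in closed:
                continue
            new_g = gscore[n] + 1
            if q not in gscore:
                a = area_of(q)
                # Conditions 1-3: outside every area, in the goal's area,
                # or in the same area as the node being expanded.
                if not (a is None or a == goal_area or a == area_of(n)):
                    continue
            elif new_g >= gscore[q]:
                continue
            gscore[q] = new_g
            parent[q] = n
            heapq.heappush(open_heap, (new_g + w * h(q), -new_g, q))
    return None, len(closed)
```

Passing `lambda n: None` as `area_of` disables the pruning, which gives a baseline for comparing the number of expanded nodes on the same map.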

Benchmark Set
Thirty grid maps of mazes and thirty grid maps of rooms were selected from the public pathfinding benchmarks library to evaluate performance in this paper. A maze map consists of corridors with fixed sizes that are randomly scattered. A room map consists of squares with fixed sizes that are uniformly distributed, with randomly generated doors between every two adjacent rooms (squares). All sixty maps of mazes and rooms have a 512 × 512 resolution, i.e., the number of nodes in each row and in each column is 512.
The thirty mazes are divided into three types of ten maps each: maze-8, maze-16 and maze-32, where 8, 16 and 32 are the sizes of the corridors in each type, respectively. Similarly, the thirty rooms are divided into three types of ten maps each: room-8, room-16 and room-32, where 8, 16 and 32 are the sizes of the squares in each type, respectively. The pathfinding benchmarks library also provides, for each grid map, hundreds of test cases with randomly generated source and goal points (called scenarios) for performing pathfinding. We use all of these scenarios in our evaluation, except that we discard invalid scenarios in which either the source or the goal node is an obstacle. Table 2 summarizes the evaluated benchmark set. In all sixty maps, obstacles are represented as vertical and horizontal line-segments. Therefore, in Table 2, we also include information about the number of horizontal and vertical line-segment obstacles, as well as the number of intersections and corners in every map. All of these measurements are relevant to the blocked area detection algorithm presented in Section 5.
Table 2. The benchmark set used for performance evaluation in this paper. S is the number of scenarios available from the pathfinding benchmarks library. B% is the percentage of obstacle nodes in the map. H is the number of horizontal line-segment obstacles. V is the number of vertical line-segment obstacles. R is the number of intersections between horizontal and vertical line-segment obstacles. N is the number of corners generated from these intersections.
Some of the identified triangular-shaped blocked areas are small, i.e., they have one side with short length. As an optimization, such blocked areas were discarded because they are not useful during actual pathfinding. In all maps, preprocessing time was less than 300 milliseconds. Table 3 presents an evaluation of the preprocessing step by showing the following measurements: (i) the number of blocked areas (BA); (ii) the percentage of nodes covered by those blocked areas (insideBA%); (iii) the percentage of nodes stored in memory due to blocked areas (joints%); (iv) the size of the BA-tree, i.e., the total number of tree nodes used to construct the BA-tree (BASize); and (v) the worst-case number of searched nodes when accessing the BA-tree (Search).
In a grid map, the percentage of nodes covered by blocked areas represents an upper bound on the number of nodes that can be eliminated during pathfinding. On average, the percentages of covered nodes in maze-8, maze-16 and maze-32 are 35%, 37% and 38%, respectively. On average, the percentages of covered nodes in room-8, room-16 and room-32 are 25%, 41% and 52%, respectively.
As was previously mentioned in Section 5, blocked areas are stored in memory using pointers that point to nodes located at the joints of blocked areas. Table 3 shows that, in the worst-case, the total number of joints in blocked areas is less than 1.3% of the total number of nodes in mazes and is less than 5.4% in rooms. This demonstrates the memory efficiency of our approach.
While accessing the BA-tree is not part of preprocessing, in Table 3, we also measure the worst-case scenario of accessing the BA-tree. Specifically, we use the search function in Algorithm 5 to access the BA-tree using every node in a grid map as a search query, and then we report in Table 3 the worst-case number of how many nodes in the BA-tree were searched. In all sixty grid maps, the worst-case search needed was no more than $6 \times \log_2 n$ nodes in the BA-tree, where $n$ is the total number of nodes in the BA-tree. This shows that, in practice, the access time of the BA-tree is logarithmic.
Table 3. Preprocessing evaluation for all sixty benchmarks in Table 2. BA is the number of blocked areas. insideBA% is the percentage of nodes covered by blocked areas. joints% is the percentage of all nodes stored in blocked areas. BASize is the total number of nodes in the BA-tree. Search is the worst-case number of searched nodes when accessing the BA-tree using Algorithm 5.

Pathfinding Evaluation
We demonstrate the impact of the blocked areas' knowledge in pathfinding using two comparisons: (i) A⋆ + BA versus A⋆; and (ii) WA⋆ + BA versus WA⋆. We do so by computing the relative execution time, which is an intuitive measurement of the reduction in execution time. For example, let us assume that the execution time of A⋆ when performing pathfinding for a particular map is 500 milliseconds, while the execution time of A⋆ + BA when performing the same pathfinding is 300 milliseconds. In this case, the relative execution time is 300/500 = 0.6, which demonstrates that the execution time of A⋆ was reduced by 40% when using the blocked areas' knowledge in its pathfinding. We also evaluate the reduction in search space, i.e., the reduction in the number of visited nodes during pathfinding, by computing the relative search space.
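The worked example above amounts to a single division. The sketch below also includes one way of aggregating several scenarios, an unweighted mean of per-scenario ratios, which is an assumption made for illustration and not necessarily the aggregation used in the evaluation; both function names are hypothetical.

```python
from typing import Sequence


def relative_time(modified_ms: float, baseline_ms: float) -> float:
    """Execution time of the modified search divided by the baseline's."""
    return modified_ms / baseline_ms


def mean_relative_time(modified: Sequence[float],
                       baseline: Sequence[float]) -> float:
    """Assumed aggregation: unweighted mean of per-scenario ratios."""
    ratios = [relative_time(m, b) for m, b in zip(modified, baseline)]
    return sum(ratios) / len(ratios)
```

With the numbers from the text, `relative_time(300, 500)` gives 0.6, i.e., a 40% reduction.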
As shown by Table 2, our benchmark set consists of six types of maps, each of which has ten instances. Furthermore, each map instance has hundreds of scenarios. Therefore, we measure performance for each one of the six map types by computing the average relative execution time and the average relative search space for all ten instances combined. Specifically, the formula we use is expressed in terms of the following quantities: $S_i$ is the number of scenarios in map instance $i$; $(A^\star + BA)_{i,j}$ is the execution time when performing pathfinding using A⋆ + BA for scenario $j$ in map instance $i$; and $(A^\star)_{i,j}$ is the execution time when performing pathfinding using A⋆ for scenario $j$ in map instance $i$. A similar formula is used for computing the relative search space. Furthermore, the average relative execution time and search space of WA⋆ + BA over WA⋆ are computed in the same manner.
Table 4 shows the average relative execution time and search space for the two aforementioned comparisons. In all three types of mazes, on average, the search spaces of both A⋆ and WA⋆ were reduced by 33%, which translated into a reduction in execution time of 28%, on average. In room maps, the search spaces of A⋆ were reduced by 22%, 36% and 45%, on average, for room-8, room-16 and room-32, respectively. As a result, the execution times were reduced by 15%, 33% and 44%, respectively. In the case of WA⋆, on average, the search spaces for room-8, room-16 and room-32 were reduced by 18%, 34% and 47%, respectively. This caused the execution time to be reduced by 16%, 30% and 43%, respectively.
The performance results in Table 4 can be explained by insideBA% in Table 3, which is the percentage of nodes covered by blocked areas in a map. For example, all three types of mazes have approximately the same coverage percentage (around 37%, on average). Thus, they have similar performance in Table 4. On the other hand, room maps have different insideBA% values, which explains their varying performance. For example, in room-32, where the coverage percentage is the highest (52%, on average), the execution time was reduced by 43%, on average. However, in the case of room-8, where the coverage is the lowest (25%, on average), the execution time was reduced by 16%, on average. Appendix A shows absolute execution times for all maps.
The performance results in Table 4 demonstrate that using blocked areas' knowledge during pathfinding has a significant impact on reducing search time in both A⋆ and WA⋆. However, Table 4 hides some interesting details about how the benefit of blocked areas' knowledge compares between short-distance and long-distance pathfinding. For example, in mazes, there are thousands of pathfinding scenarios available from the pathfinding benchmarks library, in which some scenarios have short paths (i.e., path cost is less than 100), whereas some scenarios have long paths with costs up to 2000. Figure 6 demonstrates the impact of blocked areas' knowledge on different path costs in mazes. Specifically, in Figure 6, we divide all scenarios into ten groups, where scenarios in the first group have path costs from 0 to 200, scenarios in the second group have path costs from 200 to 400, scenarios in the third group have path costs from 400 to 600 and so on. Similarly, Figure 7 depicts the impact of blocked areas' knowledge on different path costs in rooms. Note that, unlike mazes, scenarios in room maps have path costs that are between 0 and 800. Therefore, these scenarios are divided into ten groups, where scenarios in the first group have path costs between 0 and 80, scenarios in the second group have path costs between 80 and 160, scenarios in the third group have path costs between 160 and 240 and so on.
Figure 6 shows that, in all three types of mazes, blocked areas' knowledge has significant but different performance impacts on short-distance and long-distance pathfinding. Specifically, in A⋆, the execution time is reduced by 10% for low path costs, and this reduction steadily improves as the path cost increases, reaching around 35% for high path costs. In WA⋆, the reduction in execution time varies from 18% for low path costs to 34% for high path costs.
Figure 7 shows that, in all three types of rooms, the impact of blocked areas' knowledge on performance is mostly significant in both short-distance and long-distance pathfinding. However, this impact differs across room types. In A⋆, in the case of room-32, the reduction in execution time starts from a modest 3%, and then sharply improves as the path cost increases, reaching 57% for paths with high costs. In room-16, the reduction in execution time starts from 3% for low path costs and quickly increases to reach 43% for high path costs. In room-8, the reduction in execution time gradually increases from 5% for low path costs to 19% for high path costs. In WA⋆, the performance behavior is less predictable, i.e., in some cases the reduction in execution time decreases as the path cost increases. However, this variability differs in degree between different types of rooms. In room-32, the reduction in execution time starts from 10% for low path costs and then sharply improves to reach 58% for high path costs (with the exception of one case). In room-16, the reduction in execution time varies between 23% and 36% while having less consistent behavior. In room-8, the reduction in execution time varies between 6% and 30% while also having no consistent trend in performance.
We summarize the results of our evaluation of pathfinding in mazes and rooms as follows. In all three types of mazes, the execution times of both A ⋆ and WA ⋆ were generally reduced by 10-35% (the average reduction is 28%). In all three types of rooms, the execution times of both A ⋆ and WA ⋆ were generally reduced by 5-57% (the average reduction is 30%). Unlike mazes, the room map types differ significantly in the proportion of nodes covered by blocked areas, which leads to different performance behaviors.
Finally, it is worth mentioning that eliminating nodes inside blocked areas during pathfinding often led A ⋆ + BA to find a different path from the one found by A ⋆. However, in all maps and in all scenarios, both paths had the same optimal cost (as was also proven in Section 4). In the case of WA ⋆ + BA and WA ⋆, both algorithms found sub-optimal paths. In most cases, the paths obtained by the two algorithms have the same cost. Interestingly, however, in some cases we observed that WA ⋆ + BA found paths with lower costs than WA ⋆. This is because exploring blocked areas led WA ⋆ in some cases to find a longer path. On average, WA ⋆ + BA reduces the path cost of WA ⋆ by 2%.
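The first observation above, that skipping nodes inside blocked areas leaves the optimal cost unchanged while reducing the explored search space, can be illustrated on a toy map. The sketch below is not the implementation evaluated in this paper. It assumes a 4-connected unit-cost grid with a Manhattan heuristic, a hand-specified set of blocked cells (`POCKET`) rather than one computed by the preprocessing algorithm, a Python set in place of the balanced binary search tree, and a start and goal that lie outside all blocked areas. The names `astar`, `TOY_GRID`, and `POCKET` are hypothetical.

```python
import heapq

# Toy map: '#' is an obstacle. The walls form a dead-end pocket that opens
# toward the start, so plain A* is drawn into it.
TOY_GRID = [
    "..........",
    "...####...",
    "......#...",
    "......#...",
    "......#...",
    "...####...",
    "..........",
]
# Hand-specified blocked area: the cells inside the pocket (assumption).
POCKET = {(r, c) for r in (2, 3, 4) for c in (3, 4, 5)}


def astar(grid, start, goal, blocked=frozenset()):
    """Return (path cost, number of expanded nodes); cost is None if no path."""
    def h(node):
        return abs(node[0] - goal[0]) + abs(node[1] - goal[1])

    open_heap = [(h(start), 0, start)]  # entries are (f, g, node)
    best_g = {start: 0}
    closed = set()
    expansions = 0
    while open_heap:
        _, g, node = heapq.heappop(open_heap)
        if node in closed:
            continue  # stale heap entry
        closed.add(node)
        expansions += 1
        if node == goal:
            return g, expansions
        r, c = node
        for nr, nc in ((r + 1, c), (r - 1, c), (r, c + 1), (r, c - 1)):
            if not (0 <= nr < len(grid) and 0 <= nc < len(grid[0])):
                continue
            if grid[nr][nc] == "#" or (nr, nc) in blocked:
                continue  # obstacle, or node inside a blocked area
            new_g = g + 1
            if new_g < best_g.get((nr, nc), float("inf")):
                best_g[(nr, nc)] = new_g
                heapq.heappush(open_heap, (new_g + h((nr, nc)), new_g, (nr, nc)))
    return None, expansions
```

Running `astar(TOY_GRID, (3, 0), (3, 9))` with and without `blocked=POCKET` returns the same path cost, while the run with `blocked=POCKET` expands fewer nodes because the pocket cells are never placed on the open list.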

Conclusions
This paper introduced the concept of blocked areas, which are sub-regions in grid maps where there is no viable path due to obstacles. In the context of grid-based optimal or sub-optimal pathfinding, the presence of blocked areas causes heuristic search algorithms to frequently explore irrelevant paths, significantly increasing search time in the process. To decrease search time, this paper presented a preprocessing approach that uses computational geometry techniques to identify polygon-shaped blocked areas in a grid map and store information about them in a memory-efficient balanced binary search tree. During actual pathfinding, the stored knowledge about blocked areas is used to avoid exploring paths inside blocked areas, which in turn reduces search time.
We evaluated the performance by comparing the execution times of A ⋆ and WA ⋆ before and after using blocked areas' knowledge in pathfinding on a publicly available benchmark set that includes sixty maps of mazes and rooms. Our experimental results show that the execution times of both A ⋆ and WA ⋆ decrease substantially while preserving the optimality of A ⋆ and the sub-optimality of WA ⋆. This is achieved for both short-distance and long-distance pathfinding. Furthermore, we calculated the worst-case bounds for the memory needed to store blocked areas' information during preprocessing and for the access time needed to retrieve this information during pathfinding, and showed that those bounds are efficient.
Utilizing blocked areas' knowledge during pathfinding is applicable beyond the A ⋆ and WA ⋆ algorithms. In future work, we will study the impact of combining blocked areas' knowledge with other search algorithms in the literature.

Funding:
This research received no external funding.

Conflicts of Interest:
The authors declare no conflict of interest.

Abbreviations
The following abbreviations are used in this manuscript: