A Multilevel Road Alignment Model for Spatial-Query-By-Sketch

Abstract: A sketch map represents an individual’s perception of a specific location. However, the information in sketch maps is often distorted and incomplete. Nevertheless, the main roads of a given location often exhibit considerable similarities between the sketch maps and metric maps. In this work, a shape-based approach is outlined to align roads in sketch maps and metric maps. Specifically, the shapes of main roads are compared and analyzed quantitatively and qualitatively at three levels: individual road, composite road, and road scene. In an experiment, accurate road maps could be obtained automatically for eight out of nine maps sketched by our participants, taking the sketch and the metric map as input. The experimental results indicate that accurate matches can be obtained when the proposed road alignment approach, Shape-based Spatial-Query-by-Sketch (SSQbS), is applied to incomplete or distorted roads present in sketch maps, and even to roads with an inconsistent spatial relationship with the roads in the metric maps. Moreover, highly similar matches can be obtained for sketches involving fewer roads.


Introduction
With the widespread availability of the internet, people can create, publish, or query spatial data through web-based geographic information systems (GIS), such as volunteered geographic information (VGI) [1]. Geographic information systems can store, analyze, and visualize a variety of spatial information, and they support resource management, navigation, decision making, etc. Spatial query is one of the most common means by which individual citizens manipulate geographic information. It refers to a query based on an object's name, location, distance, etc. However, existing spatial query methods for general users are usually based on single attributes, such as a feature's name (finding a building named "Hotel") or the distance between features (finding the nearest hotel). To find a hotel with amenities such as restaurants and shopping malls in its vicinity, a spatial combination of multiple attributes (distance, name, etc.) is required. Existing geographic tools for common users, such as Google Maps, cannot perform this query. Professional GIS tools, such as QGIS (https://qgis.org/en/site/), implement this query by first creating the hotel's buffer and then calculating the intersection of that buffer with the restaurants, shopping malls, etc. This process is obscure for average users.
The key to solving this problem is whether it is possible to find ways of human-computer interaction that are close to human's daily communication methods and can be understood by GIS systems, so that humans can communicate with computers in the same way as their counterparts [2].
A sketch map outlines a place/area/region in a drawing where the mappers represent the more characteristic elements according to their point of view. In 1948, Tolman categorized sketch maps as cognitive maps [3]. Sketch maps often reflect a strong correspondence between the spatial relations in the map and the environment and the interaction interfaces between people and their environment [4]. Individuals can create sketch maps simply by drawing an object on paper or on a touch-sensitive screen by using drawing software. Spatial-query-by-sketch (SQbS) [5] aligns the objects in a sketch map with those from different spatial databases. It translates a sketch map into a symbolic representation and processes it against a geographic database [6]. For example, the SketchMapia framework [7][8][9] identifies seven invariant sketch features to establish a correspondence between sketch maps and metric maps. In this framework, qualitative constraint networks based on existing qualitative calculi are compared to match the spatial objects, and a new structure involving local compatibility matrices is used to ensure partial matching and high accuracy. SQbS is a convenient tool for regular individuals owing to the following advantages: (i) individuals can conduct a spatial query without any particular technical background knowledge; (ii) in contrast to traditional query methods, which focus on a single attribute query, the approach supports scene queries; and (iii) in contrast to text queries, this approach is a more intuitive and natural query method that displays all the features. SQbS has potential applications in navigation [10], 3D shape retrieval [11], crime scene identification [12], etc.
However, according to Blaser [13], only a small number of objects (12-17) appears in a sketch, where human-built objects (e.g., roads and buildings) are often given more relevance than natural objects (e.g., green spaces). In addition, in the study by Davies and Peebles [14], aspects that are visibly, semantically, or emotionally significant for the mappers are usually given more priority, and as Meilinger et al. [15] pointed out, the aspects that mappers consider "uninteresting" are usually simplified. The incomplete, distorted, and schematized features drawn in a sketch map are also challenges for SQbS, because the data quality can significantly affect the computation results. For example, Bindzárová Gergelóvá et al. [16] found that the given hydrodynamic modeling process is sensitive to changes in the qualitative aspect of the input data. Ślusarski and Jurkiewicz [17] developed cartographic visualization techniques to visualize data uncertainty in the Database of Topographic Objects (DTO). In their work, three types of uncertainty (positional, attribute, and temporal) were presented using expert know-how and experience. Hátlová and Hanus [18] reviewed 90 empirical studies published since 1960 to identify the factors that can influence the quality of a sketch map. They found that, among many influencing factors, some overlooked ones, such as source map characteristics and geographical education, are also important to the quality of a sketch map.
Therefore, existing SQbS studies focus on how to achieve an accurate alignment of the sketch map with other data sources. Most of these studies address the exact or constraint relaxation matching [19] of the objects' spatial relationships. For example, Egenhofer [6] was the first to advocate SQbS based on the 9-intersection model. Then Kurata and Egenhofer [20] and Kurata [21] further demonstrated 9+-intersection relationships between directed line segments and between directed line segments and regions. Nedas et al. [22] extended the 9-intersection model by capturing metric details for line-line relations through splitting ratios and closeness measures. Lewis and Egenhofer [23] examined the topological relationships through a description of the boundary intersections between sets of objects. This approach preserved the ordering and qualitative length of these intersections. In particular, quantitative characteristics reflect individuals' direct cognition of places and play a key role in realizing SQbS.
Previous studies by Tang et al. [24] reported that the spatial relationships among objects, including topological, order, and location relationships, are not completely consistent between sketch maps and metric maps owing to the presence of incorrectly sketched objects or those sketched in a distorted manner. Nevertheless, main roads, as the principal routes of regional traffic, exhibit a high similarity between sketch maps and metric maps, likely because people tend to draw main roads more frequently and accurately than the other roads in a sketch map. This aspect can be attributed to the fact that roads and paths connect places and are frequently used for navigation. This finding is consistent with the visual judgement of individuals: two places are deemed to be similar not only because they have similar spatial relationships but also because they have similar structural characteristics, such as shapes. The roads produced by the participants in the study by Tang et al. [24] were drawn at a planning conceptualization level according to Timpf et al. [25]. The instructional and driver road conceptualizations did not appear, possibly because participants were asked to draw a place rather than roads, so details about road lanes, exits, or entrances were missing. The main goal of this paper is to enhance the correspondence between sketch maps and metric maps. To this end, a novel matching approach termed Shape-based Spatial-Query-by-Sketch (SSQbS) is presented, which is based on SQbS and takes the road shape into account. A three-level road alignment model including individual roads, composite roads, and road scenes is developed. This approach is suitable for incomplete or distorted roads, as well as for roads whose topological relationships are inconsistent between the sketch maps and the metric maps.
The remainder of this paper is structured as follows. Section 2 provides a review of the related work. Section 3 presents the sketched road characteristics at the three considered levels and describes how the main roads are extracted from a sketch. The descriptors and matching approaches for individual roads, composite roads, and road scenes are also explained in this section. Section 4 describes the conducted experiment and presents and discusses the matching results. The conclusions and scope for future work are presented in Section 5.

Related Work
Shape-based query-by-sketch techniques have been widely used for image retrieval. Gottfried [26] asserted that the shape of a sketched object is useful to identify other images containing similarly shaped objects by considering their spatial structures. To this end, Gottfried [27,28,29] proposed a global feature scheme based on a qualitative representation in which tripartite line tracks were used to describe polygons and bipartite arrangements were used to describe the relation between two disconnected objects. Bai et al. [30] applied the discrete curve evolution (DCE, [31]) to decompose complete contours and created the MPEG-7 (Moving Picture Experts Group) database of contour segments. Their representation of object parts was invariant to scaling, rotation, and translation; moreover, smaller misalignments in the contours have a robust shape context that allows shape matching. Falomir et al. [32] presented a Qualitative Shape Descriptor (QSD) to describe the contour of an object shape extracted from any digital image, together with a computational approach to compare the resulting QSDs in order to identify shape correspondences. Cao et al. [33] built the multimodal MindFinder system, which can retrieve sketches from millions of images. Their system allowed users to outline the main shapes by tagging and colouring. Xiao et al. [34] improved the MindFinder framework using shape word descriptors for sketch-based image retrieval, and they employed the classical Chamfer Matching Algorithm [35] to address the shape word matching problems. Furthermore, Xiao et al. [34] proposed the use of an inverted index structure to extend the shape word expression to a wide range of image databases.
In addition, shape-based road alignment has been applied for vector road network matching. Zhang [36] adopted a turning function as the descriptor of the road shapes. Touya et al. [37] proposed a framework based on the least squares adjustment [38,39] and position constraints to match object geographic shapes. Tong et al. [40] formulated a logistic regression matching method named OILRM (Optimization and Iterative Logistic Regression Matching), which is used to determine the distance between two lines based on their vertices for road matching. Kim et al. [41] compared the angles and directions of linear features and obtained high accuracy results by further comparing the topological relationships between the matched features. To carry out shape comparison in images, Ali Abbaspour et al. [42] studied three functions (turning, signature, and tangent functions) and three shape descriptors (shape context, LORD, and shape signature). In addition, Ali Abbaspour et al. [42] demonstrated that the turning function can be used to efficiently distinguish objects in terms of their shapes. In particular, the existing studies on shape-based query-by-sketch for image retrieval [26-31,33,34] focused on a small number of objects for comparison, whereas a sketch map may typically contain 12-17 objects [13]. Furthermore, the spatial relationships among objects must also be aligned to carry out Spatial-Query-by-Sketch. In the approach presented in this paper, the main roads from sketch maps are extracted, because they have been demonstrated to exhibit considerable similarities between sketch maps and metric maps [24]. In addition, the characteristics of the main roads are further analyzed and compared quantitatively and qualitatively at three levels: individual road, composite road, and road scene. Consequently, the approach presented here is different from those reported by Zhang [36] and Tong et al. [40], in which only quantitative characteristics (e.g., turning functions or distances between vertices) were adopted for road matching. The proposed approach differs from the work by Kim et al. [41] as we considered more than one spatial relationship, rather than the topological relationship alone. Moreover, owing to the particularity of sketch maps, the approach presented in this paper differs from the traditional vector road data query (e.g., Touya et al. [37], Ali Abbaspour et al. [42]) as it addresses the following aspects: (1) inaccuracy of drawn objects: in sketch maps, roads are often distorted, partially drawn, or missing local details; (2) omitted objects: the number of objects in a sketch map is often not the same as that in a metric map because people tend to sketch only the objects that are meaningful to them; and (3) the inaccurate representation of spatial relationships (e.g., distance and orientation): for example, a sketch can qualitatively indicate that one road is longer than another road; however, the quantitative distance between the two roads may not be accurately described.

Materials and Methods
A multilevel road model is proposed for road alignment in Section 3.1. Based on this multilevel road model, the characteristics of individual road (Section 3.2), composite road (Section 3.3), and road scene (Section 3.4) are described and compared between the sketch map and the metric map, respectively.

Multilevel Road Classification
The multilevel road model employed in the approach presented in this paper includes three levels: individual road, composite road, and road scene. First, the individual roads are extracted from a sketch map. Then, roads in sketch maps are aligned with those in a metric map progressively in three levels. In the first level, individual roads are compared between the sketch maps and the metric map, based on three characteristics: shape distance, number of critical turning points, and circulation direction. Next, according to the matching results of the individual roads and considering the first matching priority, composite roads from the sketch maps and metric map are matched in terms of three characteristics: the topological relationship, the order of appearance along an intersection, and the relative positions of the intersections. In the third level, road scene matching is performed with respect to (w.r.t.) the road with the second matching priority, the frequency of a matched road, the intersection order along the road, and the topological relation between the roads without matching priorities. The details are presented in the following subsections.

Characteristics and Matching Approach for Individual Roads
In this section, the descriptors that represent an individual road are introduced. Furthermore, the approach of matching individual roads between a sketch map and a metric map is presented. The extraction of main roads from the sketch map is described in Section 3.2.1. To represent an individual road, the following three features are adopted: (i) shape distance, which indicates the distance between two roads in terms of the shape (Section 3.2.2); (ii) number of critical turning points, which describes the curvature of individual roads (Section 3.2.3); and (iii) circulation direction, which indicates the direction of two adjacent road segments (Section 3.2.4). The calculation for individual road matching is described in Section 3.2.5.

Extraction of Main Roads
Main roads are the principal routes for regional traffic. According to analysis by Tang et al. [24], main roads exhibit a high similarity between a sketch map and a metric map. The proposed approach attempts to extract the main roads from sketch maps and match them with the roads shown in a metric map. The method reported by Jiang and Claramunt [43] is considered, which is an expansion of the approach proposed by Freeman [44] to classify streets in an urban street network using three factors:
1. The degree centrality (DC), also known as the connectivity, indicates the number of roads connected to a certain road. A given road is considered more important if a larger number of roads are connected to it. In Figure 1a, R2' has three roads connected to it (R0', R1', and R3'); therefore, the DC of R2' is 3. R3' has only one road (R2') connected to it, and therefore, the corresponding DC is 1. Consequently, R2' is more important than R3' in terms of DC.
2. The betweenness centrality (BC) is a measure of the frequency at which a certain road is passed in all the shortest paths of a given location. A higher frequency indicates a greater connectivity of a road. In the proposed approach, BC is determined by calculating the shortest path from one endpoint of a certain road to the endpoints of all the other roads and considering the occurrence frequency of a road in all the shortest paths as the BC of that road in a given location. According to Figure 1b, the occurrence frequencies for each road that appeared in all the shortest paths are 4 (R0'), 4 (R1'), 6 (R2'), and 4 (R3'). This demonstrates that R2' is more important than the other roads in this location, in terms of the BC.
3. The closeness centrality (CC), in graph theory, is based on the degree to which a point is close to all the other points. In terms of roads, the CC indicates the shortest distance from a given road to all the other roads. Because the objective of this study is to determine the main roads in a given location, this factor is not calculated when extracting the main roads.
Specifically, the proposed approach adopts the DC and BC to extract the main roads from a sketch map. Roads with higher values of DC and BC can be considered as the main roads in a location. For example, according to the above analysis, R2' is a main road in the considered location, as it corresponds to the highest DC and BC values. Figure 3 shows a road from a sketch (represented by R0) and the corresponding road from OpenStreetMap (OSM, https://www.openstreetmap.org/). Note that OSM is a map of the world, and the roads in it are models of real-world roads. Certain roads from OSM (represented by R1, R2, R3, R4, and R5 in Figure 4) were extracted for further calculation and comparison, as discussed in the subsequent sections.
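The DC and a shortest-path frequency can be illustrated with a minimal Python sketch on the road configuration of Figure 1. The adjacency dictionary, the function names, and the use of hop-count shortest paths between pairs of roads are assumptions made for illustration. The paper computes the BC from road endpoints, so the frequencies below are not expected to reproduce the values 4, 4, 6, and 4 reported above; only the ranking of R2' as the most central road is the point of the example.

```python
from collections import deque
from itertools import combinations


def degree_centrality(adj):
    """DC of a road = number of roads connected to it."""
    return {road: len(neighbours) for road, neighbours in adj.items()}


def shortest_path(adj, src, dst):
    """Breadth-first search; returns one shortest road-to-road path."""
    prev = {src: None}
    queue = deque([src])
    while queue:
        node = queue.popleft()
        if node == dst:
            break
        for nxt in sorted(adj[node]):
            if nxt not in prev:
                prev[nxt] = node
                queue.append(nxt)
    if dst not in prev:
        return []
    path, node = [], dst
    while node is not None:
        path.append(node)
        node = prev[node]
    return path[::-1]


def path_frequency(adj):
    """Simplified BC (assumption): how often a road lies on the
    shortest path between any unordered pair of roads."""
    freq = {road: 0 for road in adj}
    for a, b in combinations(sorted(adj), 2):
        for road in shortest_path(adj, a, b):
            freq[road] += 1
    return freq


# Road connectivity as described for Figure 1a: R2' meets R0', R1', R3'.
adj = {
    "R0'": {"R2'"},
    "R1'": {"R2'"},
    "R2'": {"R0'", "R1'", "R3'"},
    "R3'": {"R2'"},
}
dc = degree_centrality(adj)
bc = path_frequency(adj)
main_road = max(adj, key=lambda r: (dc[r], bc[r]))
```

Ranking by the pair (DC, BC) is one simple way to combine the two measures; the source does not state how ties or conflicts between them are resolved.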

Shape Distance
The shape distance (SD) describes the shape difference between two roads. The approach proposed by Vatavu et al. [45] was adopted, in which unordered points are used to represent the stroke shape, and the stroke number, order, and direction are ignored. To match two point clouds, an approximation of the Hungarian algorithm [46] is used to obtain the aligned points. Subsequently, the Euclidean distance between each pair of aligned points is computed, and the sum of the distances is considered as the dissimilarity between the two point clouds. Based on this method, in the proposed approach, the dissimilarity is calculated as the SD between two roads, as shown in Formula (1):
$$SD_{(Sketch,OSM)} = dis(R_{Sketch}, R_{OSM}) \quad (1)$$
where SD_(Sketch,OSM) represents the SD between two roads, R_Sketch represents a road from a sketch, R_OSM represents a road from OSM, and dis(R_Sketch, R_OSM) represents the dissimilarity of the roads from the sketch map and OSM, based on the approach of Vatavu et al. [45]. When comparing two roads, a smaller value of SD indicates that the shapes of the two roads are more similar. Table 1 lists the SD between R0 and R1-R5. Of the five roads (R1-R5), R3 and R4 are the most similar to R0 because they have the smallest SD (4.21 and 4.74). In addition, R3 and R4 in Figure 4 have higher visual similarities to R0 in Figure 3a.
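The point-cloud comparison described above can be sketched as follows. This is a simplified illustration, not the implementation of Vatavu et al. [45]: the resampling, the normalization to a unit box around the centroid, and the single-pass greedy matching are assumptions, and all function names are hypothetical. Because of these simplifications, the values it returns are not comparable to those in Table 1.

```python
import math


def resample(points, n=32):
    """Place n points at equal arc-length spacing along a polyline."""
    seg = [math.dist(points[i], points[i + 1]) for i in range(len(points) - 1)]
    total = sum(seg)
    if total == 0:
        return [tuple(points[0])] * n
    out = []
    for k in range(n):
        target = total * k / (n - 1)
        acc = 0.0
        for i, s in enumerate(seg):
            if acc + s >= target or i == len(seg) - 1:
                t = 0.0 if s == 0 else min(1.0, (target - acc) / s)
                (x0, y0), (x1, y1) = points[i], points[i + 1]
                out.append((x0 + t * (x1 - x0), y0 + t * (y1 - y0)))
                break
            acc += s
    return out


def normalize(points):
    """Translate the centroid to the origin and scale to unit size."""
    cx = sum(x for x, _ in points) / len(points)
    cy = sum(y for _, y in points) / len(points)
    shifted = [(x - cx, y - cy) for x, y in points]
    size = max(max(abs(x), abs(y)) for x, y in shifted) or 1.0
    return [(x / size, y / size) for x, y in shifted]


def shape_distance(road_a, road_b, n=32):
    """Greedy cloud matching: each point of A takes its nearest
    still-unmatched point of B; the distances are summed."""
    a = normalize(resample(road_a, n))
    b = normalize(resample(road_b, n))
    unmatched = list(range(n))
    total = 0.0
    for p in a:
        j = min(unmatched, key=lambda k: math.dist(p, b[k]))
        total += math.dist(p, b[j])
        unmatched.remove(j)
    return total


# An L-shaped road compared with a scaled, shifted copy and a straight road.
sketch = [(0, 0), (1, 0), (1, 1)]
same_shape = shape_distance(sketch, [(10, 10), (12, 10), (12, 12)])
other_shape = shape_distance(sketch, [(0, 0), (2, 0)])
```

A greedy match is order dependent, which is why the original method is described as an approximation of the Hungarian algorithm; an exact assignment would give a lower or equal sum at a higher cost.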

Number of Critical Turning Points
The number of critical turning points (NCTP) outlines the structure of a road. A critical turning point is a point at which the direction difference (represented by α in Figure 5) between two neighboring line segments is greater than a certain threshold, as represented by the blue circle in Figure 5. The threshold is based on the features of the road itself, and the minimum value of all the large corner points along a road line is typically selected as the threshold. It is reasonable to consider that two roads are likely to be similar only when the NCTP of a road from OSM is equal to or exceeds that of a road from a sketch, as expressed by Formula (2), where SimNCTP_(Sketch,OSM) represents the similarity in terms of the NCTP between two roads, and NCTP_OSM and NCTP_Sketch respectively represent the NCTP of a road from OSM and that of a road from a sketch. Table 2 shows the NCTP for R0-R5, considering a threshold of 36°. It can be observed that R3-R5 (which can be visually observed to be more similar to the sketched road) have NCTP values that are equal to or greater than that of R0.
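The counting rule can be illustrated with a short Python sketch. The function names are hypothetical, the 36° default is taken from the Table 2 setting, and the Boolean form of the similarity is an assumption based on the verbal condition above rather than a reproduction of Formula (2).

```python
import math


def turning_angles(points):
    """Direction difference (degrees, 0..180) at each interior vertex."""
    angles = []
    for p0, p1, p2 in zip(points, points[1:], points[2:]):
        h1 = math.atan2(p1[1] - p0[1], p1[0] - p0[0])
        h2 = math.atan2(p2[1] - p1[1], p2[0] - p1[0])
        diff = abs(math.degrees(h2 - h1)) % 360.0
        angles.append(min(diff, 360.0 - diff))
    return angles


def nctp(points, threshold_deg=36.0):
    """Number of vertices whose direction change exceeds the threshold."""
    return sum(1 for a in turning_angles(points) if a > threshold_deg)


def sim_nctp(nctp_sketch, nctp_osm):
    """Assumed Boolean reading: similar only if the OSM road has at
    least as many critical turning points as the sketched road."""
    return nctp_osm >= nctp_sketch


# Hypothetical roads: the sketch has one sharp turn, the OSM road has two.
sketch_road = [(0, 0), (1, 0), (2, 0.1), (2, 1)]
osm_road = [(0, 0), (1, 0), (1, 1), (2, 1)]
similar = sim_nctp(nctp(sketch_road), nctp(osm_road))
```

The asymmetry of the test reflects the observation that sketched roads tend to omit local detail, so an OSM road may carry more turns than its sketched counterpart but not fewer.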

Circulation Direction
The circulation direction [29] of an individual road (CD) identifies the direction of each road segment w.r.t. its previous road segment. The proposed approach first computes the direction difference for two adjacent road segments with Formula (3) and then obtains the circulation direction of each road segment as depicted in Figure 6. In Formula (3), p0 represents the start vertex of the first road segment; p1 is the end vertex of the first segment and also the start vertex of the next road segment; and p2 is the end vertex of the next road segment. The left and right directions for each road segment are represented as "l" and "r", respectively. Figure 7 shows the CD of two roads. The road in Figure 7a turns right, then left, then right, and finally right again. The road in Figure 7b turns left, then right, then left, and finally left again.
Figure 7. (a) The CD for this road is (r, l, r, r). (b) The CD for this road is (l, r, l, l).
If the CD of one road is the same as that of another road, the two roads are considered to be similar in terms of the CD. If the CD of one road is part of that of another road, the two roads are considered to be partially similar. Formula (4) shows the computation of similarity in terms of the CD.
where SimCD (Sketch,OSM) represents the similarity between two roads in terms of the CD; CD Sketch and CD OSM represent the CD of a road from a sketch and that of a road from OSM, respectively. Table 3 presents the CD of R0 and R1-R5. The CD of R1 is null, as it is a straight road. The CD of R0 is the same as that of R3 and R4. Moreover, the CD of R0 is part of that of R5, which means that R0 is partially similar to R5. These findings are consistent with those of a visual inspection.
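A minimal Python sketch of the CD and its comparison follows. The sign of the 2D cross product is used here as a stand-in for Formula (3), which is an assumption; coordinates are assumed to have the y axis pointing up (screen coordinates would flip left and right). Treating "part of" as a contiguous subsequence and returning labels instead of numeric scores are also assumptions, since the values assigned by Formula (4) are not reproduced here.

```python
def circulation_direction(points):
    """Return a tuple of 'l'/'r' turns, one per interior vertex.
    Collinear vertices (no turn) are skipped."""
    cd = []
    for p0, p1, p2 in zip(points, points[1:], points[2:]):
        cross = ((p1[0] - p0[0]) * (p2[1] - p1[1])
                 - (p1[1] - p0[1]) * (p2[0] - p1[0]))
        if cross > 0:
            cd.append("l")
        elif cross < 0:
            cd.append("r")
    return tuple(cd)


def is_contiguous_part(short, long):
    """True if `short` occurs as a contiguous run inside `long`."""
    n = len(short)
    if n == 0:
        return False  # a straight road (empty CD) matches nothing
    return any(long[i:i + n] == short for i in range(len(long) - n + 1))


def sim_cd(cd_sketch, cd_osm):
    """'same', 'partial', or 'different' (labels are hypothetical)."""
    if cd_sketch and cd_sketch == cd_osm:
        return "same"
    if is_contiguous_part(cd_sketch, cd_osm):
        return "partial"
    return "different"


# A road that turns right and then left.
road = [(0, 0), (1, 0), (1, -1), (2, -1)]
cd = circulation_direction(road)
```

Under these assumptions, a straight road yields an empty CD, which corresponds to the null entry reported for R1 in Table 3.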

Computation of Similarity between Individual Roads
The similarity between two individual roads is the sum of the similarities w.r.t. the shape distance (SD), number of critical turning points (NCTP), and circulation direction (CD) described in the above sections. Formula (5) can be used to compute the similarity between two roads, where Sim_R represents the similarity between two individual roads. Sample_NUM is the number of sampling points involved in the SD computation, which was assigned a value of 32 in the approach by Vatavu et al. [45]. w1, w2, and w3 respectively represent the weights assigned to the similarity in terms of the SD, NCTP, and CD. It holds that w1 + w2 + w3 = 1.0. Sim_NCTP and Sim_CD respectively represent the similarity between two roads in terms of the NCTP and CD. Furthermore, a larger value of Sim_R indicates a higher similarity between two roads. In this work, as all the features in Formula (5) have the same importance, the weights are assigned the same value: 1/3. These weights may be tuned according to different research needs.

Characteristics and Matching Approach for Composite Roads
A composite road is formed by two individual and intersected roads. The characteristics of a composite road include: (i) the first matching priority (Section 3.3.1); (ii) the topological relationship between the roads (Section 3.3.2); (iii) the order of appearance along an intersection (Section 3.3.3); and (iv) the relative intersection positions in the roads (Section 3.3.4). The approach to compute the similarity between composite roads is described in Section 3.3.5.

First Matching Priority
The first matching priority (FMP) refers to the priority of an individual road during the process of matching a composite road. The accuracy of the matching results obtained by different individual roads differs owing to the particularities of individual roads. The FMP is thus adopted to narrow the search range and increase the matching accuracy. The FMP can be assigned to an individual road that has a higher shape specificity and a higher degree centrality. For example, in Figure 8, R2 and R19 are more curved than R1 and R16; therefore, the FMP will be assigned to R2 or R19. Furthermore, the degree centrality of R19 is 5 and that of R2 is 3, so the FMP can be assigned to R19.

Topological Relationship of Composite Roads
The topological relationship (TR) characterizes the connection between two individual roads. According to analysis by Tang et al. [24], the topological relationships of main roads are highly similar between sketch maps and the metric map. The 9-intersection model reported by Egenhofer and Herring [47] is adopted to describe the topological relationship between roads. This model uses a 3 × 3 intersection matrix to describe the relationship between two features in terms of their inner part (I), boundary (B), and external part (E). Figure 9 shows an example of two topological relationships, namely, touching and disjoint. The open-source library GEOS (https://github.com/libgeos/geos/) is adopted to compute the topological relationship between two vector roads. If the topological relationships between the two roads that constitute a composite road are the same between the sketch map and metric map, the two composite roads are considered to be similar in terms of the TR. Formula (6) can be used to calculate the similarity in terms of the TR between two roads, where SimTP_(Sketch,OSM) represents the similarity in TR between two composite roads derived from a sketch and OSM; TP_Sketch represents the TR of the composite road from a sketch; and TP_OSM represents the TR of the composite road from OSM.
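The paper relies on GEOS for this step. Purely to illustrate the idea without a third-party dependency, the following Python sketch classifies two polylines as disjoint, touching, or crossing. It is not the GEOS implementation: it assumes open polylines, ignores collinear overlaps, and all names are hypothetical. A contact counts as "touches" when it falls on an end point (boundary) of either road, and as "crosses" when it lies in the interior of both.

```python
def _orient(a, b, c):
    """Twice the signed area of triangle abc."""
    return (b[0] - a[0]) * (c[1] - a[1]) - (b[1] - a[1]) * (c[0] - a[0])


def _on_segment(a, b, p):
    """True if p lies on the closed segment ab."""
    return (_orient(a, b, p) == 0
            and min(a[0], b[0]) <= p[0] <= max(a[0], b[0])
            and min(a[1], b[1]) <= p[1] <= max(a[1], b[1]))


def topological_relation(road_a, road_b):
    """Return 'disjoint', 'touches', or 'crosses' for two polylines."""
    ends = {tuple(road_a[0]), tuple(road_a[-1]),
            tuple(road_b[0]), tuple(road_b[-1])}
    touched = False
    for p1, p2 in zip(road_a, road_a[1:]):
        for q1, q2 in zip(road_b, road_b[1:]):
            # Proper crossing: interiors of both segments intersect.
            if (_orient(q1, q2, p1) * _orient(q1, q2, p2) < 0
                    and _orient(p1, p2, q1) * _orient(p1, p2, q2) < 0):
                return "crosses"
            contacts = [p for p in (p1, p2) if _on_segment(q1, q2, p)]
            contacts += [q for q in (q1, q2) if _on_segment(p1, p2, q)]
            for c in contacts:
                if tuple(c) not in ends:
                    return "crosses"  # contact inside both roads
                touched = True
    return "touches" if touched else "disjoint"


def sim_tr(tr_sketch, tr_osm):
    """Similar in terms of the TR when both relations are the same."""
    return tr_sketch == tr_osm


main = [(0, 0), (4, 0)]
relation = topological_relation(main, [(2, 0), (2, 3)])
```

With integer or exactly representable coordinates the orientation tests are exact; real road data would need a tolerance, which a geometry library such as GEOS handles more carefully.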

Order of Appearance of Roads along an Intersection
The order of appearance of connected roads along an intersection (OD), as defined by Herring [48], describes the orientation relationship between connected roads.
As shown in Figure 10, the proposed approach first computes the buffer (represented by Buffer) of a certain radius at the intersection (represented by P) of two roads.
Figure 10. P is the intersection of two roads R_Sketch1 and R_Sketch2. Buffer is the buffer of a certain radius at P. Points p1, p2, and p3 indicate the intersections between Buffer and the two roads. x1, x2, and x3 are the x-coordinate values of points p1, p2, and p3, respectively.
In this case, the order of appearance of the connected roads along P is represented by the x-coordinate values (x1, x2, and x3) of the intersections (p1, p2, and p3) between the roads and Buffer. In Figure 10, the OD along P is R_Sketch2, R_Sketch1, and R_Sketch1, because the corresponding x-coordinate values of the intersections between Buffer and the roads are x1, x2, and x3, respectively.
If the orders of appearance along an intersection are the same between the roads from the sketch and those from OSM, the roads are considered to be similar in terms of the OD, as shown in Formula (7), where SimOD_(Sketch,OSM) represents the similarity between two composite roads from a sketch and OSM, in terms of the OD; OD_Sketch represents the OD of one composite road from the sketch; and OD_OSM represents the OD of one composite road from OSM.

Relative Positions of Intersections
The relative positions of the intersections (RPI) in two connected roads can further clarify the connection between two roads. As shown in Figure 11, P is the intersection of two connected roads (green and blue lines). L1 and L2 refer to the lengths of the two parts of the road (in green) separated at P.
Figure 11. P (yellow circle) is the intersection of two roads (green and blue). L1 and L2 denote the lengths of the two parts of one road (green) separated at P.
The similarity between two roads in terms of the RPI can be determined using the following process:
• The lengths of the two parts of a road separated by P are computed for both the sketch and OSM to obtain L_1OSM, L_2OSM, L_1Sketch, and L_2Sketch;
• The absolute distances ∆d_OSM and ∆d_Sketch are computed using Formulas (8) and (9);
• ∆d_OSM and ∆d_Sketch are normalized on the basis of the lengths of the roads in OSM and in the sketch to obtain Nor_OSM and Nor_Sketch (see Formulas (10) and (11)), respectively;
• The distance D_Nor between Nor_OSM and Nor_Sketch is compared with a threshold to obtain SimRPI_(Sketch,OSM). If D_Nor is smaller than or equal to Threshold_PI, the composite roads from a sketch and OSM are considered similar, as expressed in Formulas (12) and (13):
SimRPI_(Sketch,OSM) ← D_Nor ≤ Threshold_PI ? True : False (13)
where SimRPI_(Sketch,OSM) represents the similarity between two composite roads from the sketch and OSM, in terms of the RPI, and Threshold_PI is the threshold defined to compare the similarity in terms of the RPI.
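The steps above can be sketched in Python as follows. The concrete expressions are one plausible reading of the verbal description (absolute length difference, normalization by total road length, absolute difference of the normalized values) and are assumptions, not a reproduction of Formulas (8)-(12). The function names, the example lengths, and the threshold value of 0.2 are hypothetical.

```python
def normalized_split(l1, l2):
    """|L1 - L2| divided by the total road length L1 + L2 (assumed)."""
    total = l1 + l2
    if total == 0:
        raise ValueError("road has zero length")
    return abs(l1 - l2) / total


def sim_rpi(l1_sketch, l2_sketch, l1_osm, l2_osm, threshold_pi=0.2):
    """True when the intersection splits both roads in similar
    proportions, i.e. D_Nor <= Threshold_PI."""
    nor_sketch = normalized_split(l1_sketch, l2_sketch)
    nor_osm = normalized_split(l1_osm, l2_osm)
    d_nor = abs(nor_osm - nor_sketch)
    return d_nor <= threshold_pi


# Sketch lengths are in drawing units, OSM lengths in metres; the
# normalization makes the two comparable.
close_match = sim_rpi(30, 70, 300, 500)   # 0.40 vs 0.25
poor_match = sim_rpi(30, 70, 100, 900)    # 0.40 vs 0.80
```

Because both values are ratios, the comparison is independent of the scale at which the sketch was drawn, which matters when distances in a sketch are only qualitatively reliable.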

Computation of Similarity between Composite Roads
The similarity between two composite roads is the sum of the similarities in terms of the topological relationship (TR), order of appearance of connected roads along one intersection (OD), and relative positions of intersections (RPI) described in the above sections, combined with the similarities of the constituent individual roads. Formula (14) can be used to compute the similarity between composite roads from OSM and a sketch, where Sim_CR represents the similarity between two composite roads. Sim_R1 and Sim_R2 denote the similarities between the two individual roads that constitute the composite road. w4 and w5 denote the weights assigned to Sim_R1 and Sim_R2, respectively. Sim_TP, Sim_OD, and Sim_RPI respectively represent the similarity between two composite roads in terms of the TR, OD, and RPI. w6, w7, and w8 respectively represent the weights assigned to Sim_TP, Sim_OD, and Sim_RPI. It holds that w4 + w5 + w6 + w7 + w8 = 1.0. A larger Sim_CR corresponds to a higher similarity between two composite roads. As all the features in Formula (14) have the same relevance, the weights are assigned the same value: 1/5. In other works, these weights may be tuned differently.
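The combination described above can be read as a weighted sum of five terms, which the following Python sketch illustrates. Mapping the Boolean similarities (TR, OD, RPI) to 1 and 0 is an assumption, as are the function name and the example inputs; the exact form of Formula (14) is not reproduced here.

```python
def composite_similarity(sim_r1, sim_r2, sim_tp, sim_od, sim_rpi,
                         weights=(0.2, 0.2, 0.2, 0.2, 0.2)):
    """Weighted sum of w4*Sim_R1 + w5*Sim_R2 + w6*Sim_TP
    + w7*Sim_OD + w8*Sim_RPI, with weights summing to 1."""
    if len(weights) != 5 or abs(sum(weights) - 1.0) > 1e-9:
        raise ValueError("five weights summing to 1.0 are required")
    # Assumption: Boolean similarities count as 1.0 (True) or 0.0 (False).
    terms = (sim_r1, sim_r2, float(sim_tp), float(sim_od), float(sim_rpi))
    return sum(w * t for w, t in zip(weights, terms))


# Hypothetical inputs: two well-matched individual roads, matching
# TR and OD, but a mismatch in the relative intersection positions.
score = composite_similarity(0.9, 0.8, True, True, False)
```

With equal weights each characteristic contributes at most 0.2, so a single mismatching Boolean term lowers the score by a fixed amount regardless of how well the individual roads match.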

Characteristics and Matching Approach for Road Scenes
A road scene consists of all the roads in a given location. The characteristics of a road scene include: (i) the second matching priority (Section 3.4.1); (ii) the frequency of a matched individual road appearing in all the resulting composite road matches (Section 3.4.2); (iii) the order of intersections along an individual road with priority (Section 3.4.3); and (iv) the topological relationship between roads without matching priorities (Section 3.4.4). The computation of the road scene similarity is described in Section 3.4.5.

Second Matching Priority
The second matching priority (SMP) is adopted to assign a matching priority to the second road that forms the composite road with the road that has the first matching priority (FMP, see Section 3.3.1), to reduce the matching range. For example, in Figure 12, R19 and R2 are more curved than R1 and R16; therefore, SMP can be assigned to R2 if FMP is assigned to R19. Scene matching is performed on the basis of a composite road with FMP and SMP.

Frequency of a Matched Road
The frequency of a matched road (FMR) represents the frequency with which a matched individual road appears in all the resulting composite road matches. If an individual road from OSM appears in the composite road matches with a higher frequency, it is more likely that the road scene including this road is similar to the road scene from the sketch. For example, in Figure 13, R156137559 appears in all the composite road matches: "Sketch R19 Touches R2", "Sketch R19 Touches R1", and "Sketch R19 Touches R16". It can be inferred that the road scene that includes R156137559 in OSM is more similar to that including R19 in the sketch.
Figure 13. R156137559 appears in all the composite road matches. The first level of each tree view in the resulting composite road matches represents each composite road from the sketch. The child nodes represent the matched composite roads from OSM sorted according to the roads' similarities to the parent composite road from the sketch.
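The FMR can be illustrated with a short Python sketch that counts, for each OSM road, the number of sketch composite roads whose match list contains it. This counting rule is one possible reading of the FMR, not necessarily the authors' exact definition. The data layout (a dictionary from sketch composite roads to lists of OSM road pairs) and all OSM road IDs other than R156137559 are hypothetical.

```python
from collections import Counter


def matched_road_frequency(composite_matches):
    """Count in how many sketch composite roads' match lists each OSM road occurs."""
    counts = Counter()
    for osm_candidates in composite_matches.values():
        # Collect each OSM road once per sketch composite road.
        roads_in_candidates = set()
        for osm_pair in osm_candidates:
            roads_in_candidates.update(osm_pair)
        counts.update(roads_in_candidates)
    return counts


# Composite road matches keyed by the sketch composite road (placeholder OSM IDs).
matches = {
    "Sketch R19 Touches R2": [("R156137559", "R_A")],
    "Sketch R19 Touches R1": [("R156137559", "R_B")],
    "Sketch R19 Touches R16": [("R156137559", "R_C"), ("R_D", "R_A")],
}
frequency = matched_road_frequency(matches)
best_road, best_count = frequency.most_common(1)[0]
```

In this example, R156137559 has the highest frequency, so road scenes containing it would be examined first.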

Intersection Order of Roads along One Main Road
The intersection order (IO) represents the orientation relationship of other roads along the road with the first matching priority. If the IO along the road with the first matching priority (FMP, see Section 3.3.1) is consistent between two road scenes, the two road scenes are considered to be similar in terms of the IO. If the IO of a road scene is part of that of another road scene, the two road scenes are considered to be partially similar.
First, the intersections between all the other roads and the road with FMP are determined. Then, all the other roads are sorted according to the coordinate values of these intersections. For example, in Figure 14a, the IO along road R19 is {R1, R16, R2}. In Figure 14b, the IO along road R19 is {R1, R16, R2, R1}. Because {R1, R16, R2} is part of {R1, R16, R2, R1}, it can be inferred that the IO along R19 in Figure 14a is part of the IO along R19 in Figure 14b; therefore, the two road scenes are considered to be partially similar in terms of the IO.
The similarity of two road scenes in terms of the IO is computed using Formula (15), where $SimIO_{(Sketch,OSM)}$ represents the similarity in IO between two road scenes from a sketch and OSM; $IO_{Sketch}$ represents the IO of a road scene from a sketch; and $IO_{OSM}$ represents the IO of a road scene from OSM.
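The IO comparison can be sketched in Python as follows. Several details are assumptions made for illustration: intersections are sorted by their (x, y) coordinates, "part of" is interpreted as a contiguous sub-sequence, and the roads in both scenes are assumed to carry common labels already. Formula (15) itself is not reproduced here, and the function names and return labels are hypothetical.

```python
def intersection_order(intersections):
    """Order the crossing roads by the coordinates of their intersections.

    `intersections` is a list of (road_id, (x, y)) pairs along the FMP road.
    """
    return [road for road, _ in sorted(intersections, key=lambda item: item[1])]


def io_similarity(io_sketch, io_osm):
    """Classify two intersection orders as similar, partially similar, or different."""
    if io_sketch == io_osm:
        return "similar"
    shorter, longer = sorted((io_sketch, io_osm), key=len)
    n = len(shorter)
    # Assumed meaning of "part of": the shorter order occurs contiguously in the longer.
    if n and any(longer[i:i + n] == shorter for i in range(len(longer) - n + 1)):
        return "partially similar"
    return "different"


# The orders along R19 given for Figure 14a and Figure 14b.
result = io_similarity(["R1", "R16", "R2"], ["R1", "R16", "R2", "R1"])
```

A looser reading of "part of", such as a non-contiguous subsequence, would tolerate roads missing from the middle of a sketch, at the cost of more false matches.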

Topological Relationship between Roads without Matching Priorities
The topological relationship between roads without matching priorities (TRMP) describes the spatial relationship between roads without matching priorities and can further distinguish the similarity between road scenes in terms of the topological relationships. For example, in Figure 15, the green road is a road assigned with matching priority, and the yellow and blue roads are roads without matching priorities. The TRMP between the yellow and blue roads in Figure 15b (touching) is the same as that between the yellow and blue roads in Figure 15a (touching) and different from that between the yellow and blue roads in Figure 15c (disjoint). The similarity between two road scenes in terms of the TRMP is computed using Formula (16), where $SimTRMP_{(Sketch,OSM)}$ represents the similarity in TRMP between two road scenes from a sketch and OSM; $TRMP_{Sketch}$ represents the TRMP of a road scene from a sketch; and $TRMP_{OSM}$ represents the TRMP of a road scene from OSM.
Figure 15. The topological relationship between the roads with TRMP in Figure 15a is the same as that in Figure 15b and different from that in Figure 15c. (a) Road scene from a sketch. The green road has matching priority. The topological relationship between the yellow and blue roads is that of touching. (b) Candidate road scene from OSM. The topological relationship between the yellow and blue roads is that of touching. (c) Another candidate road scene from OSM. The topological relationship between the yellow and blue roads is disjoint.

Computation of Similarity between Road Scenes
The similarity of two road scenes is the weighted sum of the similarities of the composite roads that constitute the road scene, the intersection order (IO), and the topological relationship between roads without matching priorities (TRMP) detailed in the preceding sections. The similarity between two road scenes can be obtained as follows:
$$Sim_{SR} = w_9 \sum_{i=1}^{n} Sim_{CRi} + w_{10}\,Sim_{IO} + w_{11}\,Sim_{TRMP} \tag{17}$$
where $Sim_{SR}$ represents the similarity between two road scenes; $Sim_{CRi}$ represents the similarity of the i-th composite road that constitutes the road scene; and n is the number of composite roads that constitute the road scene. $w_9$ is the weight assigned to the sum of all $Sim_{CRi}$. $Sim_{IO}$ and $Sim_{TRMP}$ respectively represent the similarity of two road scenes in terms of the IO and TRMP, and $w_{10}$ and $w_{11}$ respectively represent the weights assigned to them. It holds that $w_9 + w_{10} + w_{11} = 1.0$. A larger $Sim_{SR}$ corresponds to more similar road scenes. In this paper, the weights in Formula (17) are all assigned the same value (1/3), since all the features have the same relevance. Other studies may tune these weights differently.
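A minimal Python sketch of the scene-level score follows. It assumes that Formula (17) applies $w_9$ to the sum of the composite road similarities, as the weight definitions above suggest, and that the Boolean IO and TRMP results are scored as 100 or 0, as in the experiments. The function name and the input values are hypothetical.

```python
def road_scene_similarity(sim_crs, sim_io, sim_trmp,
                          w9=1 / 3, w10=1 / 3, w11=1 / 3):
    """Weighted sum of composite-road similarities, IO, and TRMP."""
    def score(flag):
        # Boolean characteristics are mapped to 100 (True) or 0 (False).
        return 100.0 if flag else 0.0

    return w9 * sum(sim_crs) + w10 * score(sim_io) + w11 * score(sim_trmp)


# Hypothetical scene with two composite roads, a consistent IO,
# and an inconsistent TRMP.
sim_sr = road_scene_similarity([68.0, 72.0], sim_io=True, sim_trmp=False)
```

Because the composite road similarities are summed rather than averaged, the score is not bounded by 100 and grows with the number of composite roads in the scene.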

Results and Discussion
We invited participants to draw a sketch of the northern campus of the Xianlin University District of Nanjing Normal University, Nanjing, China (see https://www.openstreetmap.org/relation/9356340#map=17/32.11603/118.91191), as shown in Figure 16. Eleven participants (four male and seven female participants) with ages ranging from 20 to 30 years sketched the same experimental area. Nine participants had geographical knowledge as their major was in the GIS domain, whereas two participants had no geographical background but had used Google Maps before. Figure 17 shows the sketches, numbered S1-S11, drawn by the 11 participants. OpenStreetMap (OSM) data was taken as the reference data for alignment. The sketches were digitized manually to build the road network. Main roads in each sketch were extracted and adopted for the test application. The approach described above was used to perform matching experiments between each sketch and the OSM data, as discussed in Section 4.2.
Figure 17. Sketches drawn by participants. Note that the Chinese annotations in the sketches are the categories or names of the objects labeled by volunteers. S10 and S11 were not included in the subsequent analysis because the sketched roads in these two sketches were schematic and did not reflect the road shapes.

Extracting Main Roads
All these sketch maps refer to the same place, but their similarity is low. This might be due to the fact that each participant reflected his/her particular understanding of the place [49]. Therefore, it was necessary to extract the main roads from the sketch maps. It should be noted that certain sketched roads such as those appearing in sketches S10 and S11 are schematic and do not reflect the shapes of the roads. These schematic roads only represent the accessibility or connection between two places/buildings. Therefore, the roads in these sketches were not included in our subsequent road-related matching calculations.
The parameters of degree centrality (DC) and betweenness centrality (BC) were adopted, as described in Section 3.2.1, to extract the main roads from the sketches. Roads with high values of both DC and BC were considered the main roads. Figure 16d presents all the roads in the experimental area, as extracted from OSM, and the main roads, which are the central roads in the experimental area. Table 4 shows all the roads drawn in each sketch except sketch S7, and the main roads extracted from each sketch according to their DC and BC. Only two roads were drawn in sketch S7, so both of them were extracted as main roads.
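To make the two measures concrete, the following Python sketch computes DC and BC on a small road-connectivity graph in which each node is a road and each edge means that two roads intersect. The graph is a hypothetical example, DC is taken as the raw number of connected roads, and BC is computed with Brandes' algorithm without normalization. The definitions in Section 3.2.1 and the rule for deciding which values count as "high" are not reproduced here.

```python
from collections import deque


def degree_centrality(adj):
    """Number of roads directly connected to each road."""
    return {road: len(neighbours) for road, neighbours in adj.items()}


def betweenness_centrality(adj):
    """Unnormalized betweenness (Brandes' algorithm) for an undirected, unweighted graph."""
    bc = {v: 0.0 for v in adj}
    for s in adj:
        stack = []
        preds = {v: [] for v in adj}
        sigma = {v: 0 for v in adj}   # number of shortest paths from s
        dist = {v: -1 for v in adj}
        sigma[s], dist[s] = 1, 0
        queue = deque([s])
        while queue:                  # breadth-first search from s
            v = queue.popleft()
            stack.append(v)
            for w in adj[v]:
                if dist[w] < 0:
                    dist[w] = dist[v] + 1
                    queue.append(w)
                if dist[w] == dist[v] + 1:
                    sigma[w] += sigma[v]
                    preds[w].append(v)
        delta = {v: 0.0 for v in adj}
        while stack:                  # accumulate dependencies in reverse BFS order
            w = stack.pop()
            for v in preds[w]:
                delta[v] += sigma[v] / sigma[w] * (1 + delta[w])
            if w != s:
                bc[w] += delta[w]
    # Each unordered pair of roads is visited twice in an undirected graph.
    return {v: value / 2 for v, value in bc.items()}


# Hypothetical sketch network: R1, R16, and R2 each intersect R19 only.
roads = {"R19": {"R1", "R16", "R2"}, "R1": {"R19"}, "R16": {"R19"}, "R2": {"R19"}}
dc = degree_centrality(roads)
bc = betweenness_centrality(roads)
```

In this example, R19 has the highest DC (3) and lies on every shortest path between the other roads, so it would be the natural candidate for a main road.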
From Table 4, the following observations could be made:
• R2 and R19 were most frequently extracted as the main roads. These roads are consistent with the main roads in OSM, as shown in Figure 16d. The same ID was assigned to the same roads in each sketch to facilitate the analysis.
• The number of main roads extracted from each sketch was different, as shown in the lower part of the second column in Table 4. The maximum number was 6 (S5), and the minimum number was 2 (S7); however, almost half the roads in each sketch were extracted as the main roads.
• In sketches S3, S4, and S7, all the roads were extracted as the main roads. In S3, the DC and BC values were the same for R2 and R19 and for R0 and R21. In S4, all the roads except for R19 had the same DC and BC. In S7, only two roads were drawn.
• In S5, R0, R1, R2, R3, R6, and R19 were selected as the main roads as they had the same values of DC and BC. In S9, although R1, R2, R3, R6, and R15 had similar DC values, R1 was not chosen as a main road owing to its lower BC. This aspect also holds for R0 and R16 in S8.

Road Scene Matching
After extracting the main roads from each sketch, as shown in Table 4, matching experiments were performed between each sketch and OSM to compute the road similarities. The experiments included three steps: individual road matching, composite road matching, and road scene matching. A shapefile containing 15,242 roads of Nanjing, China, from OSM was used as a matching database. The computation time was between 58.2 s and 1803 s. Table 5 presents the matching results from this database.

• Each row in Table 5 shows the main roads extracted from each sketch, the matching parameters, and the matching results from OSM.
• The first column in Table 5 lists the IDs of the sketches, corresponding to those in Figure 17.
• The second column lists the main roads extracted from the sketches. In addition, the matching parameters are listed, including MaxMN, which represents the maximum number of roads from OSM involved in the matching of the road from one sketch with the first matching priority (FMP, see Section 3.3.1), and ThresholdCon, which represents the threshold of the direction difference for comparing the circulation direction (CD, see Section 3.2.4) values.
• The remaining columns show the top-ranked matching results from the database based on the main roads extracted from each sketch. The similarity between each road scene from OSM and that from each sketch is also presented. The greater the similarity, the more similar the roads in the sketch and OSM are.
Formula (17) was used to align roads between the sketch map and the metric map. A value of 100 was adopted to quantitatively compute the similarity if the similarity between two roads in Formulas (2)-(4), (6), (7), (13), (15) and (16) was True. In contrast, a value of 0 was used if the similarity between two roads in these formulas was False. The weights involved in Formulas (5), (14) and (17) were assigned equal values: the values of w 1 , w 2 , and w 3 in Formula (5) were set to 0.33; the values of w 4 , w 5 , w 6 , w 7 , and w 8 in Formula (14) were set to 0.2; and the values of w 9 , w 10 , and w 11 in Formula (17) were set to 0.33.
Furthermore, it should be noted that two different matching results are shown for S5 in Table 5, as the topological relationship between the red and green roads differs between OSM (disjoint) and S5 (touching). The first row of matching results for S5 corresponds to the top three results obtained without considering the topological relationship of these two roads. The second row corresponds to the top three results obtained considering this relationship.
Figure 18 shows the total time cost of road matching for each sketch. It can be seen that:
• Matching of sketch S9 took the shortest time, because the number of main roads extracted from S9 is the smallest (3) and two of its main sketched roads are curved. Note that there are only two sketched roads in sketch S7, so the time cost for S7 covers only composite road matching.
• Matching of sketch S6 took the longest time. As can be seen from Table 4, S6 has only one curved main road, and its remaining main roads are all nearly straight. Moreover, five roads were extracted as main roads in S6 for matching, which correspondingly increased the matching time.

Discussion
As indicated in Table 5, all the sketch maps could be used to query the accurate corresponding roads from OSM, except for S7. The key experimental results are as follows.
Although the roads were drawn partially or distorted in several sketch maps, accurate matching results could be obtained, including those for sketches S1, S2, S4-S6, S8, and S9. In Table 5, S1 has one partially drawn road (red line), but an accurate metric road map was observed in the top matching result. This finding also holds for S2, S4, S5, and S9 (having a partially drawn road in red, blue, green, and purple, respectively). Furthermore, the accurate matching results of these sketch maps were in the top three results.
Accurate matching results could also be obtained for sketch maps with distorted roads. For example, in S4, all the main roads were drawn in a distorted manner except for the one in blue; nevertheless, the exact match was in the second place (see Table 5). S6 (distorted road in green) and S8 (distorted road in purple) also exhibited accurate matching results in the third and first places, respectively.
Accurate matching results could also be obtained for sketch maps in which the topological relationships between roads are inconsistent with those in the metric map. In S5, two roads (in red and blue) were drawn inconsistently with those in the metric map, which led to an inconsistent topological relationship (see Figure 19). Thus, two rows of matching results are shown for sketch S5: one row shows the results obtained considering the topological relationship between these two roads, while the other row shows the results obtained without considering this relationship (see the two rows of matching results for S5 in Table 5). Nevertheless, an accurate matching result (Sim SR = 201.744) was obtained, even when the topological relationship between the two roads was not considered (see the first row of matching results for S5 in Table 5).
Accurate road maps could not be obtained for S7 from OSM, as shown in Table 5. This can be attributed to the small number of roads (only 2) drawn in the sketch and the few curves in these roads (see Table 4).
In particular, the first matching priority (FMP, see Section 3.3.1) can be assigned to the road with more shape curves and a higher degree centrality value in a sketch map. A more curved shape can improve the similarity obtained when computing the shape distance. A higher degree centrality value means that more roads are connected to this road, as discussed in Section 4.1, and correspondingly involves more composite road matchings. For example, in sketch S1, the green road is more curved than the other extracted main roads. Furthermore, its degree centrality value (3) is the highest, as Table 5 shows. This also holds for the same road in sketches S2-S7 and S9. In sketch S8, by contrast, the same road (in purple) is a straight line, so FMP was assigned to the road in green. FMP can help narrow the search range, as demonstrated in Section 3.3.1.

Conclusions and Future Work
This paper proposes a shape-based approach for spatial-query-by-sketch, known as Shape-based Spatial-Query-by-Sketch (SSQbS). In the work by Tang et al. [24], it was noted that main roads exhibit considerable similarities between the sketch maps and the metric map. Considering this aspect, in this work, the main roads were extracted from sketch maps based on two factors, namely, the degree centrality and betweenness centrality. Subsequently, the main roads were compared quantitatively and qualitatively in three levels: individual road, composite road, and road scene. In the first level, we calculated the similarities between two individual roads considering quantitative characteristics, such as the shape distance, and qualitative characteristics, such as the circulation direction. In the second level, the topological relationship and quantitative characteristics of a composite road (i.e., composed of two individual roads), such as the relative positions of the intersections, were aligned between the sketch maps and the metric map. In the third level, all the main roads extracted from a sketch map were further quantitatively and qualitatively compared with those in the metric map based on the road scene characteristics, e.g., the frequency of a matched road. Next, the matching results were sorted based on the sum of the characteristic similarities. An experiment was performed, in which nine sketch maps drawn by nine participants were aligned with the metric map derived from OSM. The results indicated that for eight out of nine sketch maps, accurate road maps could be obtained from the metric map. The inaccurate matching of the remaining sketch map was a result of the small number of roads in the sketch (only 2) and the presence of only a few shape curves in the roads.
Furthermore, it was noted that accurate matching results could be obtained for sketch maps with partially drawn roads or distorted roads and even for roads with an inconsistent topological relationship with those in the metric map.
The following aspects are expected to be a part of future work:
• Comparison of the spatial relationship between roads and buildings, such as the orientation relationship, topological relationship, and ordering relationship. In particular, in some sketches, only a few roads are drawn, which makes it challenging to match these roads. Combining the sketched roads with buildings may improve the accuracy of matching.
• Development of a method to match the sketched places with discrete sketched roads. Discrete sketched roads are road segments generated during the process of digitization. Specifically, in this work, the roads in the sketch maps were digitized manually and the road integrity was guaranteed. In future work, the digitization will be conducted automatically, by decomposing the roads in the sketch maps into several line segments. Note that the shape of a discrete sketched road may be different from that of a completely sketched road, and the topological relationship between such roads may also be different. Therefore, in future work, we intend to realize matching between discrete sketched roads and complete roads in the metric map.
• Application of the proposed model to an unknown experimental area. In this study, the road similarity was examined considering roads in familiar places. An objective of future work is to study the characteristics and consistency of roads in unfamiliar scenes.

We particularly thank all those who participated in our experiment, as well as our colleagues who provided helpful and significant comments.

Conflicts of Interest:
The authors declare no conflict of interest.

Abbreviations
The following abbreviations are used in this manuscript: