3.1. Data Source
Within the navigation service developed, OSM was chosen as the data source. As mentioned earlier, the primary reason for this decision was that the data, which contain both geographical and semantic information, are openly accessible and can be regarded as relatively complete in the target areas for usage (predominantly urban areas of Western Europe). The underlying data structure for OSM comprises three primitive feature types: nodes, linestrings (termed "ways" in OSM) and relations. Nodes are point-like features; linestrings are ordered collections of joined nodes; and relations describe relationships between features that may not be obviously physically connected (e.g., bus stops on a bus route). Polygon features are not directly stored in OSM, but are instead generated from closed linestrings. In addition to geographic coordinates, features store attributes in the form of “tags”. These tags are key-value pairs, and each feature can have any number of tags (though no key can be duplicated within the same feature). For example, a hospital building could be represented as a closed linestring with the tags building = yes and amenity = hospital. OSM does in fact have a key for landmarks, though globally fewer than 1700 features carry a tag with this key.
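The data model described above can be illustrated with a minimal sketch. The class and field names below are hypothetical and chosen only for illustration; they are not part of any OSM library. Storing tags in a dictionary mirrors the rule that a key cannot be duplicated within one feature.

```python
from dataclasses import dataclass, field


@dataclass
class Node:
    """Point-like OSM feature (hypothetical illustrative class)."""
    id: int
    lat: float
    lon: float
    tags: dict = field(default_factory=dict)  # key-value pairs, keys are unique


@dataclass
class Linestring:
    """Ordered collection of node ids (hypothetical illustrative class)."""
    id: int
    node_ids: list
    tags: dict = field(default_factory=dict)

    def is_closed(self) -> bool:
        # A closed linestring starts and ends on the same node,
        # so it can be interpreted as a polygon.
        return len(self.node_ids) >= 4 and self.node_ids[0] == self.node_ids[-1]


# The hospital example from the text: a closed linestring with two tags.
hospital = Linestring(
    id=1,
    node_ids=[10, 11, 12, 13, 10],
    tags={"building": "yes", "amenity": "hospital"},
)
```

Here `hospital.is_closed()` is true, so the feature would be treated as a polygon.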
When looking at OSM data for the Greater London area (U.K.), several characteristics relevant to landmark identification can be identified. Assessing the availability of the information required by methods based on individual characteristics (such as color and height) shows that such information is largely lacking. Of the 291,801 polygon features identified as buildings (those that contain a building tag not set to no), 6822 (2.3%) contain a tag indicating the color (building:colour or colour) and 398 (0.14%) carry a height value (building:height or height). Of the 299,572 nodes that have one or more tags, only 228 (0.076%) contain a color value and 57 (0.019%) contain a height value. Of all features in the Greater London area, 14 nodes (0.005%) have a landmark tag, and no polygons or linestrings carry the tag.
On the other hand, if particular types of features are used (see Table 1 for a list of feature types), then 22,299 (7.4%) node features are obtained. If the polygon features are queried, then 15,860 of 359,847 polygons (4.4%) are identified. The polygon search was relaxed to include all polygons and not just buildings, because features such as parks and playgrounds, which could be seen as landmarks, would not necessarily be buildings. From these values, it is clear that OSM generally lacks information about the individual visual characteristics of features, but a large number of features contain information regarding their type or usage.
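The kind of tag-coverage count reported above can be sketched as follows. This is a minimal illustration on hypothetical toy data, assuming each feature is reduced to a dictionary of its tags; the function names are not from the original service.

```python
def is_building(tags):
    """A feature counts as a building if it has a building tag not set to 'no'."""
    return "building" in tags and tags["building"] != "no"


def tag_coverage(features, keys):
    """Return (matches, total, percent) for features carrying any of the keys."""
    total = len(features)
    matches = sum(1 for tags in features if any(key in tags for key in keys))
    percent = 100.0 * matches / total if total else 0.0
    return matches, total, percent


# Hypothetical toy data, not taken from the Greater London extract.
features = [
    {"building": "yes", "building:colour": "red"},
    {"building": "yes"},
    {"building": "no", "colour": "blue"},
    {"building": "church", "height": "30"},
    {"amenity": "bench"},
]

buildings = [tags for tags in features if is_building(tags)]
colour_stats = tag_coverage(buildings, ("building:colour", "colour"))
height_stats = tag_coverage(buildings, ("building:height", "height"))
```

Applied to the reported counts, the same ratio gives 100 × 6822 / 291,801 ≈ 2.3% for building color.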
As mentioned earlier, the completeness of OSM data varies between locations, but in urban settings, where pedestrian navigation is of more use, the presence of data should be sufficient (in particular in Western Europe). In the county of Hertfordshire in the U.K. (~1640 km²), which has a similar area to Greater London (~1570 km²), there are only 2225 nodes and 2306 polygons that match the feature type parameters. These figures represent 3.6% of 61,859 nodes (containing at least one tag) and 4.0% of 57,855 polygons (any polygon in the dataset, not just buildings). The number of real-world features from which landmarks can be drawn is much smaller in Hertfordshire, which has a much lower average population density (approximately 700 persons/km²) than Greater London (approximately 5500 persons/km²) as a result of being far less urbanized.
Figure 1 shows the landmark candidate density based on the feature type method within the Greater London and Hertfordshire areas.
From this investigation into the availability of landmark candidates in these two areas, it can be seen that in more densely-populated areas, there is a larger pool of landmark candidates. This is likely due to two reasons: areas of higher population density are likely to contain more amenities and facilities that could be landmarks, and it has been shown in the literature that such areas are more complete with regard to OSM data [39]. Therefore, it must be noted that in less densely-populated areas, the effectiveness of any tool relying on OSM data could be diminished.
Overall, the information regarding the usage of feature types vs. individual characteristics and the distribution of features in urban and rural regions has shown that considering the type of feature yields many more landmark candidates, but even then, the availability of candidates in rural areas is limited. Such results should be taken into account when assessing the usage of OSM data in landmark identification processes as performance in one geographical area may not be the same in another.
3.2. Landmark Identification Methods
Using OSM as a source of information about features in the environment that could become landmarks, methods based on previous studies in the field of landmark identification and navigation have been implemented. These processes have been integrated into a service (described in Section 4) that generates navigation instructions containing landmarks for a route. This section discusses the methods used in the identification process, which focuses on six primary attributes: visual/semantic saliency, distance from waypoint, visibility, position, location, and uniqueness.
When identifying a suitable landmark for a decision point on a route, it is important to know the location of the decision point, the direction the traveler is coming from and the direction they should end up taking. Once the location of the decision point is known, the first step is to select all features in the vicinity that could be a possible landmark for the decision point. This is accomplished in the service by selecting all features that are identified in OSM as being of a particular type or usage within a 50-meter buffer around the turning point. This distance was selected through experimentation with different values, so as to cover enough area to provide landmark candidates without including features that are too far away. Using a smaller value also improves the performance of the service. If the previous decision point is less than 50 meters from the current decision point, the buffer distance is set to the distance between the previous and current decision points. This ensures that landmarks are not selected behind the traveler. The type/usage tags used for selection can be seen in Table 1. The tags have been selected based on a combination of previous studies, the frequency of tag usage in OSM and the opinion of the authors regarding the generic feature types that could be seen as landmarks under the correct circumstances. This list will, however, be updated in future prototypes to take into account feature types identified by end users.
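The candidate selection step can be sketched as below. This is a simplified illustration under stated assumptions: coordinates are planar and in meters, each feature is reduced to a single representative point, and `LANDMARK_TAGS` is a hypothetical stand-in for the type/usage list in Table 1.

```python
import math

DEFAULT_BUFFER_M = 50.0

# Hypothetical subset of type/usage tags; the real list is the one in Table 1.
LANDMARK_TAGS = {("amenity", "pub"), ("amenity", "place_of_worship"), ("leisure", "park")}


def buffer_radius(current_wp, previous_wp=None, default=DEFAULT_BUFFER_M):
    """Use the default buffer unless the previous decision point is closer."""
    if previous_wp is None:
        return default
    return min(default, math.dist(previous_wp, current_wp))


def select_candidates(features, current_wp, previous_wp=None):
    """Keep features of a landmark type that lie within the buffer radius."""
    radius = buffer_radius(current_wp, previous_wp)
    selected = []
    for feature in features:
        has_type = any((k, v) in LANDMARK_TAGS for k, v in feature["tags"].items())
        if has_type and math.dist(feature["xy"], current_wp) <= radius:
            selected.append(feature)
    return selected


# Hypothetical features around a decision point at the origin.
features = [
    {"name": "pub", "tags": {"amenity": "pub"}, "xy": (20.0, 10.0)},
    {"name": "far park", "tags": {"leisure": "park"}, "xy": (45.0, 0.0)},
    {"name": "bench", "tags": {"amenity": "bench"}, "xy": (5.0, 0.0)},
]
near = select_candidates(features, current_wp=(0.0, 0.0), previous_wp=(-30.0, 0.0))
```

With the previous decision point only 30 m away, the radius shrinks to 30 m and only the pub is kept; with the default 50 m buffer the park would be selected as well.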
In some cases, such as features marked as shops, additional information must be present, which indicates the name or brand of the shop. This is because in many cases, multiple shops would be in the same vicinity, and thus, the extra information is needed to distinguish individuals.
When selected, each feature is also assigned a salience weight value, which is shown in Table 1. This value represents a proposed visual/semantic saliency value for use in the overall suitability scoring. These saliency weights were initially derived by the authors following principles similar to those presented by Duckham, Winter and Robinson [17]. After initial determination, the values were adjusted iteratively based on the results obtained from running the extraction process. It was identified that, in some cases, particular feature types that were popular in an area became dominant even though other factors (such as distance from the decision point and uniqueness) should have made the individual features less suitable. Updating the visual/semantic saliency weight reduced such over-selection.
Once the landmark candidates have been selected, geometric evaluation is performed to identify additional aspects of the features in relation to the decision point, which can contribute to the overall suitability as a landmark. These processes aim at determining:
The distance of the feature from the decision point,
The visibility of the feature as the decision point is approached,
The position of the feature in relation to the decision point (before, after, alongside)
The location of the feature in relation to the current direction of travel (to the left or right)
Within the calculation of these aspects, a number of locations on the route and landmark candidates are required. These locations are:
The waypoint (decision point) itself (WP)
The reference point on the current route segment (RP). This point is a location on the route approaching the waypoint that is the same distance from the waypoint as the value used as the distance of the buffer for selecting features in the area (default 50 meters)
The point on the perimeter of the landmark candidate that is closest to the waypoint (LWP)
The point on the perimeter of the landmark candidate that is closest to RP (LRP)
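The points LWP and LRP can be derived with a closest-point-on-perimeter search. The sketch below is one possible way to do this, assuming planar coordinates in meters and a footprint given as a list of vertices without a repeated closing vertex; the function names are illustrative only.

```python
import math


def closest_point_on_segment(p, a, b):
    """Project p onto segment a-b and clamp the projection to the segment."""
    ax, ay = a
    bx, by = b
    px, py = p
    dx, dy = bx - ax, by - ay
    seg_len_sq = dx * dx + dy * dy
    if seg_len_sq == 0:
        return a  # degenerate segment
    t = ((px - ax) * dx + (py - ay) * dy) / seg_len_sq
    t = max(0.0, min(1.0, t))
    return (ax + t * dx, ay + t * dy)


def closest_point_on_perimeter(p, ring):
    """Return the point on the polygon perimeter that is nearest to p."""
    best, best_dist = None, math.inf
    for a, b in zip(ring, ring[1:] + ring[:1]):
        q = closest_point_on_segment(p, a, b)
        dist = math.dist(p, q)
        if dist < best_dist:
            best, best_dist = q, dist
    return best


# Hypothetical square footprint, waypoint and reference point.
footprint = [(10.0, 5.0), (20.0, 5.0), (20.0, 15.0), (10.0, 15.0)]
wp = (15.0, 0.0)
rp = (-35.0, 0.0)
lwp = closest_point_on_perimeter(wp, footprint)
lrp = closest_point_on_perimeter(rp, footprint)
```

In this toy layout LWP falls on the middle of the nearest edge, while LRP falls on the corner of the footprint facing the approach.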
In the case that the feature is a point, some extra steps are taken before the aforementioned locations are derived. Firstly, if the point feature is located within a building polygon (as is often the case in OSM), it is moved to the closest point on the perimeter of the building footprint. If it is not within a footprint, its location is left unchanged. Next, a small buffer of 0.000001 decimal degrees is placed around it, and the resultant perimeter of the buffer is used as the feature. The main reason for applying this buffer is that it allows the algorithms described to be applied to both point and polygon OSM features, as ultimately both are represented as polygons within the processes.
Distance from the waypoint: The literature indicates that features closer to the waypoint are often more suitable. Therefore, the indicator for distance is determined by calculating the Euclidean distance from WP to LWP. Before the final suitability calculation, this value is normalized to lie between zero and one based on the equation:
$$D = 1 - \frac{d}{MD}$$
where $D$ is the final value, $d$ is the Euclidean distance between WP and LWP and $MD$ is the maximum distance, given by the maximum search distance for landmarks (50 meters in this implementation).
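A minimal sketch of the distance normalization follows. It assumes the normalization takes the form D = 1 − d/MD, which is consistent with the stated range (zero to one, with one being closest); the clamping to [0, 1] is an added safeguard and an assumption, not something described in the text.

```python
import math


def normalized_distance(wp, lwp, max_distance=50.0):
    """Return D in [0, 1]; 1 means the feature touches the waypoint."""
    d = math.dist(wp, lwp)
    value = 1.0 - d / max_distance
    # Clamp as a safeguard (assumption): candidates are expected to lie within MD.
    return max(0.0, min(1.0, value))


# Hypothetical planar coordinates in meters.
d_touching = normalized_distance((0.0, 0.0), (0.0, 0.0))
d_halfway = normalized_distance((0.0, 0.0), (25.0, 0.0))
d_at_limit = normalized_distance((0.0, 0.0), (50.0, 0.0))
```

A feature 25 m from the waypoint therefore receives D = 0.5 with the default 50 m search distance.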
Visibility of feature: For determining the visibility of a feature, a simple and naive approach is currently used. Though not optimal, the method is relatively fast and gives correct results in most circumstances, and an assessment of visibility is often seen as a very important metric [10,22,25,26]. To calculate the visibility, a line is created between RP and LRP. This line is then compared to building footprints in the area to determine the visibility based on intersections with the footprint polygons. The building footprints are identified from OSM by selecting all features in the area that are polygons and carry a building tag. Rather than simply identifying whether the line crosses a building footprint, the length of any intersection is calculated. If no footprint is crossed, this value is zero. However, due to artifacts created by geographic projection and slight data inaccuracies, the landmark candidate often has minute intersections with the feature in the building footprint set that represents it (e.g., a polygon feature for a church would be present in the candidate dataset, and the same polygon would be present in the building footprint dataset). Using a minimum threshold value (approximately 10 cm), it is possible to remove the majority of errors introduced by this problem. The result is a binary value: zero if the feature is not visible on the approach to the waypoint (there are intersections with building footprints of more than 10 cm) and one if it is visible.
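The line-of-sight test can be sketched in plain Python as follows. This is an illustrative approximation, not the service's implementation: it assumes planar coordinates in meters, simple (non-self-intersecting) footprints, ignores edges parallel to the sight line, and applies the 10 cm threshold per footprint, which is an assumption since the text does not say whether lengths are summed.

```python
import math


def point_in_polygon(pt, ring):
    """Ray-casting test for a point against a simple polygon."""
    x, y = pt
    inside = False
    n = len(ring)
    for i in range(n):
        x1, y1 = ring[i]
        x2, y2 = ring[(i + 1) % n]
        if (y1 > y) != (y2 > y):
            x_cross = x1 + (y - y1) * (x2 - x1) / (y2 - y1)
            if x < x_cross:
                inside = not inside
    return inside


def edge_crossings(p, q, ring):
    """Parameters t in [0, 1] at which segment p->q crosses polygon edges."""
    rx, ry = q[0] - p[0], q[1] - p[1]
    ts = []
    n = len(ring)
    for i in range(n):
        ax, ay = ring[i]
        bx, by = ring[(i + 1) % n]
        sx, sy = bx - ax, by - ay
        denom = rx * sy - ry * sx
        if denom == 0:
            continue  # parallel edge, ignored in this sketch
        t = ((ax - p[0]) * sy - (ay - p[1]) * sx) / denom
        u = ((ax - p[0]) * ry - (ay - p[1]) * rx) / denom
        if 0 <= t <= 1 and 0 <= u <= 1:
            ts.append(t)
    return ts


def intersection_length(p, q, ring):
    """Length of the part of segment p->q that lies inside the polygon."""
    ts = sorted(set([0.0, 1.0] + edge_crossings(p, q, ring)))
    seg_len = math.dist(p, q)
    total = 0.0
    for t0, t1 in zip(ts, ts[1:]):
        tm = (t0 + t1) / 2
        mid = (p[0] + tm * (q[0] - p[0]), p[1] + tm * (q[1] - p[1]))
        if point_in_polygon(mid, ring):
            total += (t1 - t0) * seg_len
    return total


def is_visible(rp, lrp, footprints, threshold_m=0.10):
    """Return 1 if no footprint blocks more than threshold_m of the sight line."""
    blocked = any(intersection_length(rp, lrp, ring) > threshold_m for ring in footprints)
    return 0 if blocked else 1


# Hypothetical data: a 20 m deep building and a 5 cm sliver on the sight line.
building = [(40.0, -5.0), (60.0, -5.0), (60.0, 5.0), (40.0, 5.0)]
sliver = [(40.0, -5.0), (40.05, -5.0), (40.05, 5.0), (40.0, 5.0)]
blocked_length = intersection_length((0.0, 0.0), (100.0, 0.0), building)
```

The sliver case corresponds to the projection artifacts described in the text: its intersection stays below the threshold, so the candidate is still treated as visible.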
Position in relation to the decision point: The position of the feature with regard to the decision point is more complicated to derive, as it requires a number of comparisons. The end result is an indication that the feature is before, after or alongside the waypoint. This metric is derived by comparing the distances WP to RP, RP to LRP and RP to LWP. If RP → LRP and RP → LWP are both less than RP → WP, it can be assumed that the landmark candidate is before the turning point. If RP → LRP is greater than RP → WP, then the landmark can be assumed to be after the turning point. When RP → LRP is less than RP → WP and RP → LWP is greater than RP → WP, it can be assumed that the landmark candidate is alongside the WP. Looking at Figure 2, it can be seen that the distance RP → LRP1 is less than RP → WP, but RP → LWP1 is larger than RP → WP; therefore, that feature is said to be alongside the turning point. RP → LRP2 and RP → LWP2 are both larger than RP → WP; therefore, that feature is seen as being after the turning point. Finally, both RP → LRP3 and RP → LWP3 are smaller than RP → WP, and so that feature is seen as being before the turning point.
Figure 2 shows the information used for calculating the position of three features in relation to a turning point.
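The before/alongside/after comparison can be sketched as a small function. Planar coordinates in meters are assumed, and the handling of exactly equal distances (treated here as not greater) is an assumption, since the text does not specify it.

```python
import math


def classify_position(rp, wp, lrp, lwp):
    """Classify a candidate as 'before', 'alongside' or 'after' the waypoint."""
    d_wp = math.dist(rp, wp)
    d_lrp = math.dist(rp, lrp)
    d_lwp = math.dist(rp, lwp)
    if d_lrp > d_wp:
        return "after"      # even the nearest point to RP lies beyond WP
    if d_lwp > d_wp:
        return "alongside"  # the feature spans the waypoint
    return "before"         # both points are closer to RP than WP is


# Hypothetical layout: travel from RP at the origin to WP 50 m to the east.
rp, wp = (0.0, 0.0), (50.0, 0.0)
before = classify_position(rp, wp, lrp=(20.0, 5.0), lwp=(30.0, 5.0))
alongside = classify_position(rp, wp, lrp=(45.0, 5.0), lwp=(50.0, 5.0))
after = classify_position(rp, wp, lrp=(60.0, 5.0), lwp=(60.0, 5.0))
```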
The location of the feature in relation to the current direction of travel: The final geometric calculation determines whether the feature is on the same side of the road as the direction of the turn, or on the opposite side. To determine this, the azimuths of LWP → RP and RP → WP are compared. If the angle LWP → RP → WP is between −180° and 0°, the feature is seen as being on the left of the waypoint. If the angle is between 0° and 180°, it is deemed to be on the right-hand side. This information is then compared to the direction of the turn to determine whether the feature is on the same side as the turn or not.
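One way to obtain the left/right decision is from the signed difference between two azimuths measured at RP. The sketch below is an interpretation under stated assumptions: planar coordinates with x pointing east and y pointing north, azimuths measured clockwise from north, and a zero difference treated as right. The weights of two (same side as the turn) and one (opposite side, or no change of direction) follow the values given for the suitability score.

```python
import math


def azimuth_deg(a, b):
    """Azimuth of the vector a->b in degrees, clockwise from north."""
    return math.degrees(math.atan2(b[0] - a[0], b[1] - a[1])) % 360


def side_of_travel(rp, wp, lwp):
    """Return 'left' or 'right' for LWP relative to the travel direction RP->WP."""
    diff = azimuth_deg(rp, lwp) - azimuth_deg(rp, wp)
    diff = (diff + 180) % 360 - 180  # wrap into [-180, 180)
    return "left" if diff < 0 else "right"


def location_weight(side, turn):
    """2 if the feature is on the same side as the turn, otherwise 1."""
    if turn == "straight":
        return 1  # no change of direction, so the side is ignored
    return 2 if side == turn else 1


# Hypothetical layout: heading north from RP to WP.
rp, wp = (0.0, 0.0), (0.0, 50.0)
west_side = side_of_travel(rp, wp, lwp=(-10.0, 40.0))
east_side = side_of_travel(rp, wp, lwp=(10.0, 40.0))
```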
After calculation of the geometric aspects of the features in relation to the decision point, the uniqueness of the feature within the candidate set is determined. If there are multiple features of the same type in the vicinity of the decision point, it will be more difficult for the user to determine which feature is being referenced as a landmark. The uniqueness of the feature is calculated using:
$$U_t = \frac{1}{n_t}$$
where $U_t$ is the uniqueness of a feature with type $t$ and $n_t$ is the number of features within the candidate population with a type of $t$. With this calculation, features whose type does not occur again in the candidate selection have a value of one, and those with multiple occurrences have a proportionally smaller value (two features of the same type would both have a uniqueness value of 0.5).
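The uniqueness value can be computed with a simple count per type. The sketch assumes the value is the reciprocal of the number of candidates sharing the type, which matches the stated behaviour (one for a unique type, 0.5 for two features of the same type); the type names are hypothetical.

```python
from collections import Counter


def uniqueness(candidate_types):
    """Return 1 / n_t for each candidate, where n_t counts candidates of its type."""
    counts = Counter(candidate_types)
    return [1 / counts[t] for t in candidate_types]


# Hypothetical candidate set: two pubs and one church.
values = uniqueness(["pub", "pub", "church"])
```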
Once all of the values described have been calculated, they are used to derive an overall suitability score. The landmark candidate with the highest score is used as the landmark in the routing instruction. The final suitability metric $S$ is determined as:
$$S = V \times P \times L_d \times (D + U + S_a)$$
where $V$ is the visibility of the feature (zero or one), $P$ is the position of the feature in relation to the decision point (after = 1, alongside = 2, before = 3), $L_d$ is the location in relation to the direction of travel (opposite side = 1, same side = 2), $D$ is the normalized distance from the waypoint (zero to one, with one being closest), $U$ is the uniqueness of the type in the candidate set (zero to one, with one being the most unique) and $S_a$ is the proposed salience based on the feature type (zero to one, with one being the highest salience value). Using the visibility metric as a multiplier ensures that candidates that are not visible on the approach are always given a suitability value of zero. The weighting values for position and for the side of the road have been determined based on findings from the literature, where it is documented that landmarks before the turning point and on the same side as the direction of turning are preferable [17,26]. In the case that the decision point does not require a change in direction (the person continues forwards), the value for $L_d$ is always set to one, no matter which side of the road the feature is located on.
As an example of this process, consider a decision point as shown in Figure 3. In that location, the traveler is coming from the west, must cross the street and then continue east. The waypoint (WP) is depicted as the five-pointed star, and the reference point (RP) is the six-pointed star. Based on visibility, only three of the 15 candidates can be seen as being suitable as landmarks, as all others can only be seen when relatively close to the waypoint (the building footprints are shown as solid polygons). The resultant values used in the calculation of suitability can be seen in Table 2. As the direction of travel is to continue forward, the value used for location is set to one for all features, no matter which side of the road they are located on. From the potential candidates, Feature 3 is deemed to be the most suitable (1 × 3 × 1 × (0.597 + 0.5 + 0.8) = 5.692). This results in the instruction being “After the Salisbury pub, continue forwards”.
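The scoring and selection step can be sketched as follows, assuming the suitability takes the multiplicative form implied by the worked example, that is, visibility × position × location × (distance + uniqueness + salience). The Feature 3 inputs are the rounded values quoted above; the two other candidates are hypothetical and are not taken from Table 2.

```python
POSITION_WEIGHT = {"after": 1, "alongside": 2, "before": 3}


def suitability(v, p, ld, d, u, sa):
    """Suitability score; v = 0 forces the score of a non-visible candidate to zero."""
    return v * p * ld * (d + u + sa)


# Feature 3 from the worked example, using the rounded inputs given in the text.
feature_3 = suitability(v=1, p=POSITION_WEIGHT["before"], ld=1, d=0.597, u=0.5, sa=0.8)

# Hypothetical competitors for illustration only.
candidates = {
    "Feature 3": feature_3,
    "hidden feature": suitability(0, 3, 1, 0.9, 1.0, 0.8),
    "far feature": suitability(1, 1, 1, 0.2, 1.0, 0.5),
}
best = max(candidates, key=candidates.get)
```

With the rounded inputs the product evaluates to about 5.691; the reported 5.692 presumably reflects unrounded input values.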
Though the usage of these metrics in a landmark suitability calculation is not a new aspect of the research, their implementation here shows that all of the required information is available from the OSM dataset. All calculations are performed automatically and require no additional input other than this single dataset.