Visual Overlay on OpenStreetMap Data to Support Spatial Exploration of Urban Environments

Increasing volumes of spatial data about urban areas are captured and made available via volunteered geographic information (VGI) sources, such as OpenStreetMap (OSM). Hence, new opportunities arise for regional exploration that can lead to improvements in the lives of citizens through spatial decision support. We believe that the VGI data of the urban environment could be used to present a constructive overview of the regional infrastructure with the advent of web technologies. Current location-based services provide general map-based information for the end users with conventional local search functionality, and hence, the presentation of the rich urban information is limited. In this work, we analyze the OSM data to classify the geo entities into consequential categories with facilities, landscape and land use distribution. We employ a visual overlay of heat map and interactive visualizations to present the regional characterization on OSM data classification. In the proposed interface, users are allowed to express a variety of spatial queries to exemplify their geographic interests. They can compare the characterization of urban areas with respect to multiple spatial dimensions of interest and can search for the most suitable region. The search experience is further enhanced via efficient optimization and interaction methods to support the decision making of end users. We report the end user acceptability and efficiency of the proposed system via usability studies and performance analysis comparison.
ISPRS Int. J. Geo-Inf. 2015, 4 88


Introduction
Volunteered geographic information (VGI) is geographic data acquired primarily through the voluntary effort of citizens [1]. In recent years, the quality and quantity of VGI data have undergone fast-paced worldwide development. One of the most popular and widely used VGI platforms is OpenStreetMap (http://www.osmfoundation.org), whose main goal is to create a freely-available geographic database of the world. Many studies have confirmed the quality of OSM, and more specifically, it has been demonstrated that OSM is able to capture the detailed dynamics of urban areas [2,3]. Citizens contribute extensively to the enrichment of the geospatial regional characteristics of urban environments. However, the reverse applicability of such rich data sources for citizens' spatial decision support has been limited by conventional search and exploration interfaces.
Urban environments, or living cities, share the regional characteristic of a high degree of organized complexity based on spatial components: shops, offices, houses, green spaces, transportation networks, natural features, etc. Citizens need location-based services (LBS) to assist their decision making in the geographic space of the urban environment around them. At present, the majority of research and development effort in LBS is focused on providing information access about specific locations, facilities and geographic points of interest. However, more complex search types and supporting analysis are desired that enable a combined view of the underlying geographic data. In various information-seeking situations, users are interested not only in specific geo entities, but also in the composition of urban areas and geographic regions. Users look to query, analyze and compare geographic regions, which is not possible with the current Web 2.0 and local search interfaces, as their presentation methods are oriented towards specific locations and entities. Users go beyond popular place names and locations when characterizing their region of interest [4]. Users could be interested in finding appropriate regions in many decision making scenarios. A person who needs to move to a new place may look for an appropriate region to live in or need to compare various cities; a businessman opening a new shop looks for a beneficial neighborhood; and a tourist may want to visit interesting regions to fulfill her sightseeing desires in less time.
Our related requirement study [5] on such scenarios gave us the insight that multiple criteria of interest become significant simultaneously. As part of a requirements analysis, we performed a user study to better understand geospatial multi-criteria decision making. We provided the participants with several scenarios of spatial decision making, such as moving to a new place, the comparison of cities, etc. We gathered their opinions about the ideal user interface to support these tasks and about how well the existing tools would fulfill their needs in such scenarios. One of the main observations was that participants were dissatisfied with existing interfaces and expressed a strong wish to visualize geographic regions and their characterization. Especially in a scenario like moving to a new place, users tend to compare the new places with their current living environment, and they desire better support for neighborhood filtering. We therefore aim to provide an interactive interface for the search and exploration of the urban environment.
In this paper, we utilize OSM data to classify the characteristics of the urban environment and provide an interactive visual interface for the spatial exploration of this data to support the spatial decision making of citizens. We extend our earlier interface [6] to convey more in-depth knowledge about the structure of the urban environment. The proposed system provides more user control and presents a search framework for multi-criteria regional querying, exploration and city-based comparison using OSM data. We enrich the spatial information with the topographic distribution of geographic areas of interest. We integrate efficient ranking methodologies to support the regional search process. In the proposed framework, users can compare the characterization of urban areas with respect to multiple spatial dimensions of interest and can search for the most suitable region.
The rest of the paper is organized as follows. First, in Section 2, we give a brief summary of related applications and approaches. In Section 3, we present the details of our spatial data extraction framework from OSM. In the subsequent sections, we describe how end users could interact with the extracted spatial data for decision making scenarios. Section 4 describes the methods to characterize the user's information need. In Section 5, we describe the functionality of the different facets that assist end users in exploring spatial data in various contexts. In the proposed interface, users can not only explore the spatial data, but can also search and rank the regions of interest. We provide the details of the ranking mechanism in Section 6. In Section 7, we describe the evaluation study of the framework. Finally, Section 8 concludes the paper by discussing the contribution of the paper and directions for future work.

Related Approaches
There have been several studies on the quality of VGI and OSM data [2,3,7,8]. These studies have shown that the OSM data in urban areas are of high quality and mostly comparable to commercial datasets. Hence, in this work, our focus is to enhance the applicability of this data for citizens via an interactive visualization overlay. There have been several approaches that apply visualization methods to OSM data [9][10][11]; however, the focus has been primarily on improving the data quality itself rather than on supporting end users' location-based needs. There are some applications of OSM data that extend the current range of location-based services, such as finding directions and routes [12] or domain-specific applications [13]. However, the applicability of such high-quality urban data for the search and multi-criteria exploration of a relevant geographic area is still a challenge. There are local search services and geographic information retrieval systems [14,15], which provide access to geo entities. These geo entities usually belong to a criterion/category or set of categories (shopping, education, sport, etc.). Several location-based services and yellow pages follow the categorical structure for the overview of geo entities. In current geo applications, users can view the geo entities that belong to a particular category on the map, but the simultaneous display of multiple categories is usually not supported. Therefore, end users are expected to accomplish decision making tasks through sequential querying/browsing of categories from different data sources on the web, which could be an extremely complex and time-consuming task.
In the research community, multi-criteria spatial problems have often been approached from the computational perspective [16,17], which is focused on distance and density estimation of locations with respect to different criteria. Rinner and Heppleston [18] proposed geospatial multi-criteria evaluation for home buyers, where the decision criteria were based on location, proximity and direction. Even though their task has a similar scenario of spatial decision making as discussed in our work, the study was conducted for a focused group of real-estate agents, and the contribution was more on the computational aspects than on the visualizations. In comparison, we focus on visualization methods, i.e., how end users perceive and analyze multiple criteria on the local search interface for spatial decision making. Some geovisualization approaches [19,20] explore the problem of multi-criteria analysis, but they usually target specific domains, like healthcare, and support the decision analysis of a focused group of experts. In general, geospatial decision making has been one of the main challenges and applications of visual analytics [21]. For geo-related content, visual analytics is prominently being used (http://www.geovista.psu.edu, http://www.geoanalytics.net/). Many visual analytical models and tools have been developed to support critical business decision making processes, but assisting lay users in their decision making is still a major research challenge [22].
In this work, we target the users of local search services and propose map-based geovisualization to support their multi-criteria decision making. Jankowski et al. [23] have presented an exploratory approach for multiple criteria spatial problems, where they emphasize the role of maps in decision making scenarios. Martino et al. [24] proposed an integration of Google Earth within OLAP (online analytical processing) tools for multidimensional exploration and analysis of spatial data. Similar to these approaches, a map is the focus of our proposed interface; however, we integrate layers of visualizations to address the spatial decision scenarios of local search users. We aim to enrich the knowledge of citizens about urban areas using visualization models on VGI data available on the web. Visualization methods have been shown to enhance the usability of VGI data sources [25,26]. Moreover, there have been several approaches and tools to extend the applicability of VGI data sources [27,28]; however, most of these approaches are focused on urban planning and city modeling in explicit use cases, such as traffic simulation. More specifically, the research in urban analytics has also been centered on urban planning and decision making for the improvement of urban infrastructure [29][30][31][32].

OpenStreetMap Data Framework
In this work, our aim is to provide easy access to spatially-relevant information from OSM, so that end users can view the knowledge about geographic areas from regional perspectives. Hence, our geospatial database consists of a large collection of geo entities, which are extracted and composed from OSM. We categorize the OSM data into a facility and a topographic index structure. The facility distribution contains criteria such as shopping, education, medical facilities, religion, etc. The topographic distribution is based on the landscape arrangement, e.g., water, greenery, forest, mountains, etc., or the land use distribution, e.g., residential, commercial and industrial areas. The choice of categories is inspired by the user requirement analysis [5], which was conducted to understand the information need of users in regional search and decision making scenarios.
OSM provides several methods to access the geospatial map-based data. We used the XML files from Geofabrik (http://download.geofabrik.de), which contain all spatial characteristics within a selected area. These areas could be continents, countries or provinces. The OSM XML files consist of three basic elements: nodes, ways and relations. Nodes represent a specific point on the Earth's surface defined by its latitude and longitude; ways are a combination of nodes and can be closed to represent a polygon; and relations add special information to nodes and ways. The physical features on the ground are represented using tags attached to these basic elements. Each tag describes a geographic attribute of the feature represented by that specific node, way or relation. In this work, we exploit the tag information for the categorization of geo entities. The geo-located facilities are categorized via the tag information of nodes, since each facility contains a standard address with a street name, house number and associated functional information, such as "amenity"="pharmacy".
Every node, way or relation can contain various tags, which, for example, describe that a way is a highway or a street. The landscape information is represented by closed ways that contain tags with information like "water", "beach" or "farmland". In the initial process, we extract all closed ways that seem to carry landscape information into a topography index to obtain a comprehensive collection. We create the indices with Apache Lucene (http://lucene.apache.org/core). Subsequently, we define interesting regions, like cities or provinces, as bounding squares. Each square is divided into a grid with a defined number of cells. For each cell, we find the dominating land use and landfill (landscape) category and store this information in two new indices, a land use index and a landfill index. To compute the intersection between a cell and the closed ways from the topography index, we used "spatial4j", a geospatial library (https://github.com/spatial4j/spatial4j). The tag information of every closed way within the cell is categorized. For landfill, we select the categories "water", "green", "forest", "dump/waste" and "mountain/rock", and for land use, we select "commercial area", "industrial area", "residential area", "road" and "railway". The dominating category is the one that covers the largest area within the cell. The dominating land use category and the dominating landfill category of the cell are saved in the respective index. We thus obtain a grid for every interesting region with the dominating categories in it; hence, it is easy to display these cells on a map or to analyze the cells within a geometric form.
End users can interact with this information via the visualization mechanisms described in the following sections. We used Google Maps as a map framework, and the visualizations were built using Data Driven Documents (http://www.d3js.org), a JavaScript library for visualization design. The color scheme was chosen with the support of ColorBrewer (http://www.colorbrewer.org). The proposed visual overview and facets were the result of a user-centered design process, in which we experimented with different layouts, numbers and sizes. In the following subsections, we describe the interface functionalities in detail.
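The tag-based categorization of nodes described above can be illustrated with a minimal sketch. It assumes the standard OSM XML layout (node elements carrying tag children with k and v attributes); the sample document and the AMENITY_CATEGORIES mapping are hypothetical and are not the category scheme used in the system.

```python
import xml.etree.ElementTree as ET

# Hypothetical mapping from OSM "amenity" values to facility categories.
# The actual mapping used in the paper is not given in the text.
AMENITY_CATEGORIES = {
    "pharmacy": "medical",
    "hospital": "medical",
    "school": "education",
    "restaurant": "food and drink",
    "place_of_worship": "religion",
}

# Tiny hand-written sample in the OSM XML layout (not real data).
SAMPLE_OSM = """<osm version="0.6">
  <node id="1" lat="51.05" lon="13.74">
    <tag k="amenity" v="pharmacy"/>
    <tag k="name" v="Example Pharmacy"/>
  </node>
  <node id="2" lat="51.06" lon="13.75">
    <tag k="amenity" v="school"/>
  </node>
  <node id="3" lat="51.07" lon="13.76"/>
</osm>"""


def categorize_nodes(osm_xml):
    """Return (node id, lat, lon, category) for every node with a mapped amenity tag."""
    root = ET.fromstring(osm_xml)
    facilities = []
    for node in root.iter("node"):
        # Collect the node's tags into a plain key/value dictionary.
        tags = {tag.get("k"): tag.get("v") for tag in node.findall("tag")}
        category = AMENITY_CATEGORIES.get(tags.get("amenity"))
        if category is not None:
            facilities.append(
                (node.get("id"), float(node.get("lat")), float(node.get("lon")), category)
            )
    return facilities


facilities = categorize_nodes(SAMPLE_OSM)
```

Nodes without a recognized amenity tag (such as node 3 in the sample) are simply skipped; a full extraction would also need to handle ways and relations.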
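The grid step can be sketched as follows. This is a simplified illustration, not the spatial4j-based implementation: closed ways are approximated by axis-aligned boxes (min_x, min_y, max_x, max_y) in planar coordinates, a single index is built instead of separate land use and landfill indices, and the function names are hypothetical.

```python
from collections import defaultdict


def overlap_area(a, b):
    """Area of the intersection of two boxes given as (min_x, min_y, max_x, max_y)."""
    width = min(a[2], b[2]) - max(a[0], b[0])
    height = min(a[3], b[3]) - max(a[1], b[1])
    return max(width, 0.0) * max(height, 0.0)


def dominant_categories(region, cells_per_side, features):
    """Map each grid cell (row, col) to the category covering the largest area in it.

    features is a list of (box, category) pairs; cells touched by no feature map to None.
    """
    min_x, min_y, max_x, max_y = region
    step_x = (max_x - min_x) / cells_per_side
    step_y = (max_y - min_y) / cells_per_side
    grid = {}
    for row in range(cells_per_side):
        for col in range(cells_per_side):
            cell = (
                min_x + col * step_x,
                min_y + row * step_y,
                min_x + (col + 1) * step_x,
                min_y + (row + 1) * step_y,
            )
            # Accumulate the covered area per category inside this cell.
            area_by_category = defaultdict(float)
            for box, category in features:
                area_by_category[category] += overlap_area(cell, box)
            best = max(area_by_category.items(), key=lambda item: item[1], default=(None, 0.0))
            grid[(row, col)] = best[0] if best[1] > 0 else None
    return grid


features = [
    ((0, 0, 2, 1), "water"),
    ((0, 1, 1, 2), "forest"),
    ((0.5, 1, 2, 2), "green"),
]
grid = dominant_categories((0, 0, 2, 2), 2, features)
```

Running the same routine once with landscape features and once with land use features would yield the two per-cell indices described in the text.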

Characterization of Geographic Information Need
The OSM search interface resembles several other local search services, which present sequential results to end users, e.g., a search for "Restaurant in Vienna" will return all restaurants in Vienna as a list view and/or icons on a map interface. However, in the current information age, the spatial information need of an end user is often much more complex than such simple queries.
In several scenarios of spatial decision making, users look for multiple criteria of interest simultaneously, which is not very convenient with the sequential querying and categorization approach of current local search interfaces. This is particularly true for persons who are facing a complex decision making task, e.g., somebody who has to move to a new city and is looking for a new place to live. In this example, the user will most likely come up with a variety of different criteria of geo entities that he would like to visualize together, e.g., the availability of shopping facilities, medical facilities and a good connection to public transport. In this work, we provide end users with the capability to describe multiple criteria of interest simultaneously. Figure 1 shows an example of the user query interface with multiple criteria. Here, users can select and weight multiple categories for the search of an appropriate region. The last row specifies the size of the region that the user is looking for with respect to the specified criteria.

Map-Based Characterization
In addition to the search for multiple criteria, users might want to interact with the map interface to select an area on the map for regional exploration. Most of the current geovisualization interfaces focus on administrative map boundaries. Figure 2 shows the administrative map of the city of Dresden, where users can select particular districts of the city for exploration and comparison. In the shown example, the user selected a district in the center of the Dresden city map.

Query by Spatial Example
Moving beyond the traditional map boundaries, we offer users the ability to arbitrarily define their own spatial region of interest. The free definition of the query region is important, since users may not always want a neighborhood that is defined by traditional boundaries, e.g., a user might be interested in a particular region because of personal geographic preferences, such as the presence of a nearby river, a garden, a church and good connectivity to public transport. Furthermore, from our previous GIR research [15], we got the impression that the user of a geo search system needs more control over the input parameters. In particular, a free drawing technique has been requested several times, e.g., to encircle certain areas on a map, which then serve as a new reference point for further analysis. We therefore propose geographical queries by spatial example in our model, where users can define the query region by drawing on the map. The user can select the area via circular or polygon boundaries. Figure 3 shows an example of a user-selected region of interest via a polygon query (by mouse clicks and drag) on the map.
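Once a query region has been drawn, the geo entities inside it have to be collected. The following sketch shows one common way to do this, assuming planar coordinates (a reasonable approximation at city scale) and the standard ray casting test; the function names and sample entities are hypothetical and do not describe the system's actual implementation.

```python
import math


def in_polygon(point, polygon):
    """Ray casting test: True if point (x, y) lies inside the polygon given as a vertex list."""
    x, y = point
    inside = False
    n = len(polygon)
    for i in range(n):
        x1, y1 = polygon[i]
        x2, y2 = polygon[(i + 1) % n]
        # Only edges that straddle the horizontal line through the point can be crossed.
        if (y1 > y) != (y2 > y):
            x_cross = x1 + (y - y1) * (x2 - x1) / (y2 - y1)
            if x < x_cross:
                inside = not inside
    return inside


def in_circle(point, center, radius):
    """True if the point lies within the circular query region."""
    return math.hypot(point[0] - center[0], point[1] - center[1]) <= radius


def select_entities(entities, polygon):
    """Return the names of all (name, position) entities inside the polygon."""
    return [name for name, position in entities if in_polygon(position, polygon)]


query_polygon = [(0, 0), (4, 0), (4, 3), (0, 3)]
entities = [("cafe", (1, 1)), ("park", (5, 1)), ("church", (3.5, 2.5))]
selected = select_entities(entities, query_polygon)
```

The selected entities can then be grouped by category to obtain the categorical distribution of the query region.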

Spatial Overview and Exploration
In the proposed visual interactive system, users can choose a particular city or multiple cities for spatial exploration. Users can select multiple regions of interest via polygon/circular selection or via the administrative districts of the city map. Figure 4 shows the web interface for spatial exploration of the proposed system, where Munich has been selected as the reference city. Each region can be explored via multiple visualizations with tab views. The example shows three different regions of the city via different facets of the OSM data source. End users can visualize the spatial summary of geographic regions and cities based on a categorical heat map overlay on the extracted data, which in turn allows them to explore and compare the selected regions. The pie charts provide detailed information on the categorical distribution of regions when users hover over them with the mouse cursor. This reveals the availability of geo-located facilities in the selected regions. The examples in Figure 4 show the regional overview via the OSM data source. The left-most polygon region shows the facility-based distribution, where the selected region has many shopping, food and drink facilities; there are some medical facilities in addition to a few other minor categories. The circular region at the bottom shows the natural land cover view, which is primarily distributed into greenery, forest and dump. The associated view option of the circular region at the top shows how the land in the selected region has been used commercially. Figure 5 shows another example of the visual overview over the administrative map of Dresden. The interface shows how three different districts of the city are distributed with respect to geo-located facilities.

Exploration of Geo Entities
The pie chart visualization provides a quick overview of the distribution of geo-located facilities. However, a user might like to explore detailed information about the facilities and topographic distributions. In such scenarios, the user can click the "show more details" option at the bottom of the pie chart window. Figure 6 shows the detailed view of the central district of Dresden. When a user selects a particular category, all the geo entities associated with the category are listed and also shown by icons on the map. We provide all of the specific information about each geo entity, such as name, location and website.

City Comparison
In some scenarios, people like to visualize the infrastructure of a whole city and compare the city-based distribution rather than some selected regions [33]. We extend the regional explorative interface to city-based comparison to cater to such decision scenarios, in which citizens need to explore and compare various cities. Figure 7 shows an example of a city-centric overview and exploration based on the aforementioned OSM categories. The user can select a city or multiple cities for spatial exploration. In the shown example, five German cities have been selected, and the user can choose the respective tabs for the overview of the city infrastructure based on the pie chart visualization on the right side of the interface. The user can hover over a particular category to visualize how it is distributed on the city surface (in the shown example, the user visualized the areas covered by forest in Hannover). The point of view selection offers the comparison of a selected city with others. In the given example, Hannover is selected as the point of view; hence, the criteria and topographic bars of each city represent its respective facility or landscape/land use similarity to Hannover. There are a few existing interfaces for city comparison (http://www.moving.com/real-estate/compare-cities, http://www.numbeo.com), which provide an analysis based on demographics or cost of living. However, the exploration and comparison of cities based on geo-located facilities and structure could be vital for spatial decision making.

Search of Relevant Regions
In various decision making situations, users not only need to explore various regions of interest for the existing facilities, but also need to compare and find regions relevant to their criteria of interest. In such scenarios, users can specify their requirements by one of the input methods described in Section 4. With respect to the input query, users can either select a few target regions that need to be ranked or ask the system for the most appropriate region in a geographic space, such as a city. The search system shows the relevance of target regions as a percentage of similarity together with a heat map-based relevance visualization. We used a color scheme of different green tones, which differ in their transparency. Light colors represent low relevance, whereas dark colors indicate high relevance.

Ranking of Selected Regions
Figure 8 shows an example of the search scenario in which a user wants to rank selected regions of interest. Here, two German cities, Hamburg and Leipzig, were selected by the user. The user intends to compare different geographic regions of Leipzig (target regions) with respect to a certain relevant region in Hamburg (query region). The system allows the user to specify the current region of interest via a polygon query on the Hamburg map and to get the relevance of the target regions in Leipzig in ranked order. Users can visually compare the query region with the most relevant target region via their respective multifaceted distributions. In the shown example of the criteria distribution, the distribution of the region with 83% similarity can be seen to be very similar to that of the query region. The example in Figure 5 shows different districts of Dresden ranked (indicated by the color transparency) in comparison to one particular district in the center (in red).
When a user is not entirely satisfied with the criteria distribution of a selected query region, he can interact with and adapt the distribution of the selected region. Interaction models enable users to influence the ranking and retrieval process. We provide a draggable pie chart, where the user can increase or decrease the importance of a category by dragging its boundaries. The significance of the other categories is automatically adjusted in equal proportion. The interaction process allows the user to add, edit and delete categories. After modifying the categorical distribution of geo-located facilities, users can revisualize the target regions to get their updated rank of relevance.
The methodology for the ranking of regions is inspired by information retrieval (IR) ranking. In IR relevance modeling, the documents are ranked based on their likelihood to generate a query. Similarly, we judge the relevance of a target region based on its likelihood to generate/replicate the query region. Each region R is represented by a category distribution (C_1, C_2, ..., C_n). Each category C_i is based upon the geo entities that belong to category i. The relevance of a particular target region R_t is based on the similarity of its categorical distribution with respect to the query region R_q. There are various measures to estimate the distribution similarity [34,35]. However, to compute the association between geographic regions by means of their criteria distribution, we found the Euclidean distance-based distribution similarity measure [36] to be most appropriate.
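The automatic adjustment of the remaining categories can be sketched as a simple renormalization. This is one plausible reading of "adjusted in equal proportion", namely that the other categories keep their relative shares while the total stays at 100%; the function name and sample values are hypothetical.

```python
def adjust_share(distribution, category, new_share):
    """Set one category to new_share and rescale the others so the shares still sum to 1."""
    if not 0.0 <= new_share <= 1.0:
        raise ValueError("new_share must lie between 0 and 1")
    others_total = sum(v for k, v in distribution.items() if k != category)
    remaining = 1.0 - new_share
    adjusted = {}
    for name, share in distribution.items():
        if name == category:
            adjusted[name] = new_share
        elif others_total > 0:
            # Keep the relative proportions among the untouched categories.
            adjusted[name] = share * remaining / others_total
        else:
            # All other categories were empty: split the remainder evenly.
            adjusted[name] = remaining / (len(distribution) - 1)
    return adjusted


original = {"shopping": 0.5, "medical": 0.3, "education": 0.2}
adjusted = adjust_share(original, "shopping", 0.75)
```

In this example, raising shopping from 50% to 75% shrinks medical and education to 15% and 10%, so their 3:2 ratio is preserved.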
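A minimal sketch of such a ranking is given below. The exact measure from [36] is not reproduced in the text, so the scaling used here is an assumption: entity counts are normalized to shares, and the Euclidean distance is divided by sqrt(2), the largest distance possible between two such distributions, so that identical regions score 1 and completely disjoint ones score 0. All names and sample counts are hypothetical.

```python
import math


def normalize(counts):
    """Turn per-category entity counts into shares that sum to 1."""
    total = sum(counts.values())
    if total == 0:
        raise ValueError("region contains no geo entities")
    return {category: value / total for category, value in counts.items()}


def similarity(query_counts, target_counts):
    """Euclidean distance-based similarity in [0, 1] between two category distributions."""
    q = normalize(query_counts)
    t = normalize(target_counts)
    categories = set(q) | set(t)
    distance = math.sqrt(sum((q.get(c, 0.0) - t.get(c, 0.0)) ** 2 for c in categories))
    # Assumed scaling: sqrt(2) is the maximum distance between two distributions.
    return 1.0 - distance / math.sqrt(2.0)


def rank_regions(query_counts, targets):
    """Return (region name, similarity) pairs sorted from most to least similar."""
    scored = [(name, similarity(query_counts, counts)) for name, counts in targets.items()]
    return sorted(scored, key=lambda pair: pair[1], reverse=True)


query_region = {"shopping": 5, "food": 5}
target_regions = {
    "region A": {"shopping": 4, "food": 6},
    "region B": {"shopping": 1, "food": 9},
}
ranking = rank_regions(query_region, target_regions)
```

Multiplying the similarity by 100 gives a percentage of the kind shown in the interface, although the reported values (such as 83%) come from the authors' own measure.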

Ranking Optimization
There are scenarios in which users look beyond specific target regions and would like to search the entire city space with respect to a query region. However, dividing the entire city space into all possible target regions is a computationally expensive solution. Hence, we offer an optimization method to find the most relevant region in the selected map space. The proposed method is based on the particle swarm optimization (PSO) framework [37]. To apply the PSO algorithm to a regional search, we mapped some concepts of PSO to the spatial decision problem [38]. A particle stands for a circular region on the map as a candidate solution. For a particle p_i, the position X_i stands for the current location of p_i on the map, i.e., the geocoordinate of the center of the circular region. The velocity V_i describes the movement of the particle on the map surface. In the initial step, a set of particles is generated as candidate solutions with random positions and velocities. Then, each particle updates its velocity and position based on Equations (1) and (2). The particle is evaluated by a fitness function (similarity to the query region), and the local best position of the particle (P_best) and the global best position of the swarm (G_best) are updated based on the fitness value of the particle. The individual particles are drawn stochastically toward the position of their own best performance and the best performance of the swarm.
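Equations (1) and (2) are the velocity and position updates of PSO [37]. Written with the symbols used in the text, and assuming the canonical formulation rather than a variant specific to this system, they take the form:

```latex
\begin{align}
V_{i+1} &= w\,V_i + c_1\,\mathrm{rand}()\,\bigl(P_{best} - X_i\bigr) + c_2\,\mathrm{Rand}()\,\bigl(G_{best} - X_i\bigr) \tag{1}\\
X_{i+1} &= X_i + V_{i+1} \tag{2}
\end{align}
```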
Here, i stands for the current generation of the iteration process; w, c_1 and c_2 are the momentum coefficient, cognitive coefficient and social coefficient, respectively; and rand() and Rand() are functions that generate random numbers between zero and one.
In the example shown in Figure 9, the user intends to find the most suitable region in Vienna with respect to a specified region in Berlin (query region). Hence, the user selected a circular region of interest on the Berlin city map. Once the user has provided her area of interest, the system releases several particles on the city map of Vienna. Each particle has the same diameter as the query region. These particles start moving across the Vienna map to find the best region with respect to the user's criteria. Users can view the particles and the swarm movement on the map interface, and they can also control the speed of the swarm via the top panel of the interface. Here, they can also see the current status (similarity found so far) of the search process. The swarm keeps searching the map space until it discovers the best region with the highest possible similarity or until the user terminates the search process. In this example, the user terminated the search process when the swarm had found a region with 98% similarity. The best region is indicated with the dark green color and its similarity value. The transparent red circles show the last positions of the particles when the search process was stopped.
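The search loop can be sketched as follows. This is a generic two-dimensional PSO under stated assumptions, not the system's implementation: the coefficient values, the clamping of particles to the map bounds and the toy_similarity fitness function (a stand-in for the similarity between a candidate circle and the query region) are all hypothetical.

```python
import random


def pso_search(fitness, bounds, n_particles=20, iterations=100,
               w=0.7, c1=1.5, c2=1.5, seed=0):
    """Maximize fitness(x, y) over a rectangular map area with particle swarm optimization."""
    rng = random.Random(seed)
    (min_x, max_x), (min_y, max_y) = bounds
    lows, highs = (min_x, min_y), (max_x, max_y)
    span = (max_x - min_x, max_y - min_y)
    # Random initial positions (circle centers) and small random velocities.
    positions = [[rng.uniform(min_x, max_x), rng.uniform(min_y, max_y)]
                 for _ in range(n_particles)]
    velocities = [[rng.uniform(-s, s) * 0.1 for s in span] for _ in range(n_particles)]
    p_best = [p[:] for p in positions]
    p_best_fit = [fitness(p[0], p[1]) for p in positions]
    best_index = max(range(n_particles), key=lambda k: p_best_fit[k])
    g_best, g_best_fit = p_best[best_index][:], p_best_fit[best_index]

    for _ in range(iterations):
        for k in range(n_particles):
            for d in range(2):
                # Velocity update: inertia + pull to own best + pull to swarm best.
                velocities[k][d] = (w * velocities[k][d]
                                    + c1 * rng.random() * (p_best[k][d] - positions[k][d])
                                    + c2 * rng.random() * (g_best[d] - positions[k][d]))
                # Position update, clamped so particles stay on the map.
                moved = positions[k][d] + velocities[k][d]
                positions[k][d] = min(max(moved, lows[d]), highs[d])
            fit = fitness(positions[k][0], positions[k][1])
            if fit > p_best_fit[k]:
                p_best[k], p_best_fit[k] = positions[k][:], fit
            if fit > g_best_fit:
                g_best, g_best_fit = positions[k][:], fit
    return tuple(g_best), g_best_fit


def toy_similarity(x, y):
    """Hypothetical fitness with a single peak of 1.0 at (3, -1)."""
    return 1.0 / (1.0 + (x - 3.0) ** 2 + (y + 1.0) ** 2)


best_position, best_fit = pso_search(toy_similarity, ((-10.0, 10.0), (-10.0, 10.0)))
```

In the interactive setting described above, the fixed iteration count would be replaced by a stopping rule based on the similarity found so far or on the user terminating the search.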

Evaluation
We describe the two-fold evaluation process as follows. First, we discuss the studies conducted to analyze the general usability of the proposed interface. Afterwards, we summarize the results of the automatic evaluation conducted to analyze the ranking behavior.

System Usability Studies
We conducted several small-scale studies to investigate how well the proposed urban visualizations support regional search and exploration. We intended to find out the general perspective on and acceptability of such geographic visualizations among end users. Here, we summarize the results of two studies conducted with similar settings. In total, 18 volunteers (14 male, four female) participated in our study. They were aged between 20 and 54 years. To evaluate the application, we gave two scenarios to the participants: the first was more of an exploration task, and the second was designed to evaluate the comparison of regions. The participants were asked how far the application could assist them in their spatial decision making. Overall, 16 of the 18 participants mentioned that the system could help them to explore and find relevant regions. Since none of the current local search interfaces provides similar functionality for multi-criteria regional search, we could not employ a baseline system. However, we asked the participants if they could accomplish such tasks by means of current interfaces and applications (e.g., OSM, Google Maps), to which 14 answered no, and four participants thought it could be possible with more time and effort using multiple searches.

Performance Analysis of Ranking Method
To evaluate the efficiency of our optimization-based search approach, we performed experiments on large geospatial datasets of Germany and Austria. The results indicate that our search framework is able to locate relevant areas in a computationally-efficient manner. We compare the PSO-based search method with the more conventional approach of regional search, where each possible region within the map boundary becomes a candidate solution. This inherently means that the entire map space is divided into a grid raster [26]. For the 1,400 test runs, the PSO-based search took 1.05 s, compared to 13.99 s for a complete search, roughly a 13-fold speed-up. We observed that the performance gap between the complete search and PSO methods depends significantly on the complexity and size of the search space [38]. This indicates that the optimization-based regional search method is even more beneficial when the spatial database is larger and could be of practical use in scenarios where the search space is very large.

Conclusions
Geoweb 2.0 sources, such as OpenStreetMap, contain relevant spatial information about urban infrastructure and facilities. However, the associated end user interfaces do not support the search and exploration of geographic regions and cities, which is an important aspect in several spatial decision making scenarios. In this paper, we proposed interactive interfaces on OpenStreetMap data, so that end users can specify, explore and rank the urban areas of interest. We proposed various input methods to characterize the spatial information need of end users. Easy exploration of spatial areas is supported via layers of heat map- and icon-based visualization. Regional ranking and optimization methods have been integrated to provide search assistance in decision making scenarios. We assessed the acceptability of the proposed user interface and ranking algorithms by means of user feedback and automated performance evaluation.
The proposed visualization and search system is designed for OpenStreetMap data; however, the methodologies are not restricted to a particular VGI data source. They could be employed to portray the regional distribution of any kind of geographic point-based data. In future work, we plan to apply the regional search framework to various kinds of geospatial data sources and to make the system available to real-world users.

Figure 4. Spatial overview of user-selected regions.