Improved Indoor Positioning by Means of Occupancy Grid Maps Automatically Generated from OSM Indoor Data

Abstract: In recent years, there has been growing interest in indoor positioning due to the increasing number of applications that employ position data. Current approaches to determining the location of objects in indoor environments face problems with the accuracy of the sensor data used for positioning. One way to compensate for inaccurate and unreliable sensor data is to include further information about the objects to be positioned and about the environment in the positioning algorithm. For this purpose, occupancy grid maps (OGMs) can be used to correct such noisy data by modelling the occupancy probability of objects being at a certain location in a specific environment. In that way, improbable sensor measurements can be corrected. Previous approaches, however, have focussed only on OGM generation for outdoor environments or require manual steps. There remains a need for research examining the automatic generation of OGMs from detailed indoor map data. Therefore, our study proposes an algorithm for automated OGM generation using crowd-sourced OpenStreetMap indoor data. Subsequently, we propose an algorithm to improve positioning results by means of the generated OGM data. In our study, we used positioning data from an Ultra-wideband (UWB) system. Our experiments with nine different building map datasets showed that the proposed method provides reliable OGM outputs. Furthermore, taking one of these generated OGMs as an example, we demonstrated that integrating OGMs into the positioning algorithm increases the positioning accuracy. Consequently, the proposed algorithms enable the integration of environmental information into positioning algorithms and thereby increase the accuracy of indoor positioning applications.


Introduction
Indoor positioning has received much attention in recent years due to the vast number of applications that employ position data. According to [1], indoor positioning is the process of determining the location of objects or persons in indoor and closed-space environments in real time. Positioning systems are applied, for example, to localize and track assets in production buildings, to navigate persons through indoor environments or to analyze a person's trajectory in elderly care applications. Such systems use different types of technologies, such as inertial sensors, visual markers, cameras, time of flight (ToF) sensors, or WiFi-based technologies. Each of these localization techniques has specific disadvantages in indoor environments, which lead to inaccurate localization results. A comprehensive survey of different technologies and their properties is provided by [2]. WiFi signals, for instance, can be disturbed by metallic objects, ToF-based approaches require a line of sight, and inertial sensor data is prone to error accumulation. Even technologies such as ultra-wideband systems with a theoretically achievable accuracy of 10 cm can be influenced by the environment, so that the positioning error reaches values of up to 3 m. As a consequence, the acquired sensor data can be inaccurate and unreliable, which results in invalid localizations, such as persons detected within a wall.
In order to compensate for localization errors, it is necessary to include further information about the object to be positioned and about the environment in the positioning algorithm. One possibility to improve the localization accuracy is the integration of indoor map data: the given structure of a building with its specific spatial features, such as corridors, stairways and doors, allows invalid positions to be eliminated. This principle is presented in a variety of publications (e.g., [3][4][5][6]), where sensor data from optical sensors (cameras) or radio sensors (Bluetooth Low Energy or WiFi) are matched against indoor map data to avoid implausible states and to improve positioning results. Moreover, by considering the structure of indoor environments, both probable and improbable object occurrences can be defined. A person walking through a building will probably not walk directly next to a wall and will definitely not pass through a wall. An incorrectly acquired position within a wall could then be adjusted to a valid position next to the wall. The described example is visualized in Figure 1. Further existing positioning technologies as well as methods that implement such position corrections are described in Section 2. The occurrence probability of a specific object within a building can be modelled by means of so-called occupancy grid maps (OGMs), which represent the occupancy probability of an object on the floor plan of a building. An OGM is modelled by a cell matrix in which each cell is a square area of the indoor environment holding the probability of being occupied. Hence, generating OGMs requires floor plans and consequently indoor data of buildings.
Since indoor data about buildings have either been generally unavailable or only been provided in Computer Aided Design (CAD) formats, alternative standards for indoor mapping have been established: Open Geospatial Consortium (OGC) CityGML [7], OGC IndoorGML [8], Building Information Modelling (BIM) [9], and OpenStreetMap (OSM) with the mapping scheme Simple Indoor Tagging (SIT) [10]. An overview of these standards is given in [11]. With currently over 70,000 mapped and freely available indoor rooms [12,13], OSM is a very extensive data source for research. To the knowledge of the authors, a comparable extent of indoor data based on the other standards is not available. This wide distribution of indoor data and the free availability of map editing tools show that OSM is an ideal standard and database for the generation of OGMs, so that the approach can be applied to a large number of buildings.
To date, little attention has been paid to the involvement of OSM indoor data in OGM generation. Moreover, the employment of OGMs generated from such data for positioning improvement has not been investigated so far. This article therefore examines OSM indoor map data as a data source for the generation of OGMs and introduces a procedure to create such OGMs as an input for indoor positioning algorithms. Subsequently, we propose an algorithm to improve positioning results by means of OGM data. In our study, we used positioning data from an Ultra-wideband (UWB) approach. Our results demonstrate that the proposed method decreases the positioning error.
The article is the extended version of the paper presented at the Conference on Geographical Information Systems Theory, Applications and Management (GISTAM) 2020 [14] and is structured as follows: Section 2 presents state-of-the-art methods for positioning as well as for OGM generation and outlines the research gap. Thereupon, Section 3 introduces the proposed automated procedure to generate OGMs from OSM indoor data by illustrating the system concept overview and subsequently describing the realization of the single system modules. Additionally, this section describes how the generated OGMs can be applied to improve the positioning. The obtained results for OGM generation and positioning are presented and discussed in Section 4. Finally, Section 5 concludes the article and gives an outlook on future work.

Related Work
This section presents extant work on positioning and OGM generation and outlines the research gap the present study aims to close.

Positioning
This section introduces positioning technologies and explains the employed positioning system in more detail. First, Table 1 provides an overview of state-of-the-art positioning technologies with their respective features and their applicability to modern Android- or iOS-based smartphones, and outlines the advantages of Ultra-wideband (UWB). Subsequently, the working principles of existing UWB systems are considered, and the principle of the UWB-based system developed for this study is explained in more detail.
Table 1. Overview of state-of-the-art positioning technologies.
Technology | Properties | References
Visual markers (camera-based) | very accurate positioning (mean error below 10 cm), requires defined camera orientations and a direct line of sight to markers, range is affected by obstacles, privacy issues, user interaction with the smartphone necessary | [15][16][17]
Bluetooth Low Energy (BLE) | up to Bluetooth 5.0: evaluation of signal strength, low accuracy, prone to noise, short range, cost-efficient hardware, battery-powered with long operating times; Bluetooth 5.1: allows Angle-of-Arrival (AoA) and Angle-of-Departure measurements, accurate positioning, requires special beacons and end-user devices due to the need for antenna arrays, more expensive | [18][19][20]
Wireless Fidelity (WiFi) | requires fingerprinting, does not scale with large buildings, prone to environmental changes, can be used with existing access points; Channel State Information (CSI) shows reasonable results, but is not accessible through the development Application Programming Interfaces (APIs) of modern smartphones; IEEE 802.11mc allows round trip time measurements for position estimations with accuracies below 1 m, requires special access points | [21,22]
Near-field communication (NFC) | high accuracy, very short range (20 cm), user interaction with NFC tags required | [23]
Magnetometers, gyroscopes, inertial sensors | no additional infrastructure required, low to medium accuracy, fingerprinting required, device-specific sensitivity | [24][25][26]
Radio Frequency Identification (RFID) | commonly used for tracking of capital goods, requires expensive readers or special smartphones, tags are encoded with a unique identification number, range of up to 7 m, low accuracy of approx. 5 m | [27,28]
Ultra-wideband (UWB) | high accuracy, high bandwidth allows handling of multipath propagation, long range, up to now available on iPhone and Samsung mobile phones, requires extra hardware as infrastructure, capable of AoA measurements to further improve accuracy or to reduce the amount of infrastructure | [29][30][31]
Considering the advantages and disadvantages of the presented technologies, we chose Ultra-wideband (UWB) as the positioning solution for our navigation application: compared to the other technologies, UWB provides very high accuracy and a long operating range with relatively low integration and maintenance costs. Furthermore, this technology is currently being integrated into an increasing number of smartphones.
Commercially available UWB systems rely on two main principles: Two-Way-Ranging (TWR) or Time-Difference-of-Arrival (TDoA) [32]. TWR calculates the position of a mobile device by determining the distances between the device to be positioned and each infrastructure node. Each distance is measured by means of the round-trip time a radio message requires to be exchanged between mobile device and infrastructure node. Due to the high message traffic, this method limits the number of participants that can be positioned concurrently. TDoA faces the same problem of limited participants: each device to be positioned sends messages to all infrastructure nodes within range. Each signal propagation time is then forwarded to a central computer, which calculates the device position and sends it back to the mobile device. The accumulated transfer times for all messages result in high traffic and consequently limit the number of participants as well. Moreover, both methods are prone to eavesdropping due to the two-way communication between mobile devices and the infrastructure or central computers, respectively.
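To make the TWR principle described above concrete, the following minimal sketch converts a round-trip time into a distance. The function name `twr_distance` and the `reply_delay_s` parameter (the processing delay at the responding node) are assumptions for illustration only and are not taken from any specific UWB system.

```python
SPEED_OF_LIGHT_M_PER_S = 299_792_458.0  # vacuum value, used as an approximation for air


def twr_distance(round_trip_s: float, reply_delay_s: float = 0.0) -> float:
    """Estimate the distance between a tag and an anchor from a round-trip time.

    reply_delay_s is a hypothetical parameter for the time the responding
    node needs before it answers; it is subtracted before halving.
    """
    time_of_flight_s = (round_trip_s - reply_delay_s) / 2.0
    if time_of_flight_s < 0:
        raise ValueError("reply delay exceeds the measured round-trip time")
    return SPEED_OF_LIGHT_M_PER_S * time_of_flight_s
```

Because one such exchange is needed per tag and anchor pair, the message count grows with the number of participants, which is the traffic limitation mentioned above.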
In contrast to these two methods, the UWB approach used in this study, called Reverse TDoA, provides privacy by design and supports an unlimited number of participants. Both properties are achieved by a pure transmission mode of the infrastructure nodes and a pure reception mode of the mobile devices, which are also called UWB tags. In addition, and in contrast to the previously explained methods, the infrastructure is synchronized. As a result, the propagation times of the signals transmitted by the infrastructure are evaluated on the mobile devices to calculate the device position. The principle is visualized in Figure 2.
Figure 2. Positioning concept: so-called Ultra-wideband (UWB) Satlets (infrastructure nodes, also called anchors) are distributed in the indoor environment and transmit positioning signals that are received by mobile devices, also called UWB tags, within the indoor environment. In that way, a person holding a mobile phone, for example, can be located.
Even though the raw data provided by the employed UWB system are already highly accurate compared to other localization techniques mentioned in Table 1, OGMs can enhance this accuracy even more, which will be shown in the results section later.

Occupancy Grid Maps
Occupancy grid mapping was initially introduced by Moravec and Elfes in 1985 [33]. Originally, this mapping procedure was developed for noisy sonars and called "mapping with known poses". In literature, especially in the field of probabilistic robotics, occupancy grid mapping is often referred to as the process of generating maps from noisy and uncertain sensor data while the position of the robot with the attached sensors such as cameras, laser range scanners and LIDAR is known [34][35][36]. In this mapping problem, the aim is to build an occupancy map of the environment, in which the occurrence of obstacles is stored.
For positioning/localization, the opposite problem has to be solved: based on an existing map, the position of objects shall be derived, even in the presence of noisy sensor data. In our case, the existing map is an OSM indoor map that has to be transformed into an occupancy grid map first. In this context, OGM generation is the transformation of a floor plan into independent discrete cells. Each cell stores a variable estimating its degree of occupancy. The variable can be either binary, stating whether the cell is occupied or not, or continuous, indicating the degree of occupancy, i.e., the occupancy probability of the object to be localized.
Extant literature gives insight on how OSM maps are transferred to OGMs and thereafter used for localization purposes.
In their publications, Kurdej et al. present a localization system for intelligent vehicles that uses OSM outdoor map data as a-priori information [37,38]. This system generates OGMs based on OSM road and building information and matches sensor data from optical sensors against these OGMs.
Herrera et al. are the first to generate OGMs from OSM indoor maps [39,40]. Their algorithm derives the OGMs from a manually defined graph network that overlays the indoor map data. This graph consists of nodes, which were defined by empirical studies and denote probable indoor positions. However, these nodes have to be manually added to the graph.
Naik et al. proposed OSM-based indoor data for robot navigation and generated a primitive OGM for that purpose [41]. This generation methodology involves only a limited set of objects, namely information about rooms and corridors. Moreover, the OGM distinguishes between only two occupancy states.
To summarize, the presented related work either focuses on OSM outdoor data or lacks full automation of the OGM generation and covers only a small part of the building features. Therefore, we propose a methodology for a highly automated generation of OGMs based on OSM indoor maps that incorporates as much information about the interior of a building as possible.

Methodology
This section first provides an overview of the algorithm and its individual steps and explains the realization of every step as well as the input and output data in detail. Second, the proposed method to improve indoor positioning by means of OGMs is presented.

Automated OGM Generation
Our proposed algorithm for OGM generation is divided into three steps: coordinate transformation, data sorting and OGM rendering, as can be seen in Figure 3. The OSM data used in this publication were mapped by the authors according to the Simple Indoor Tagging (SIT) scheme [10] and are not publicly accessible because the building contains sensitive data subject to data protection law. In the first step, the input OSM map data, which were represented in World Geodetic System 1984 (WGS-84) coordinates, had to be transformed into OGM coordinates. To this end, the coordinates of the indoor map data were first converted to a metric representation in a local coordinate system (LCS), whose origin and orientation had to be defined depending on the specific positioning application that used the OGMs. This transformation was necessary to represent all objects, i.e., both objects from the OSM map and objects to be located, in the same local and metric coordinate system. Transforming the indoor data into a local metric format also brought the benefit of compatibility with other devices, such as industrial robots that also work with a local metric coordinate system or devices that process a given grid map. Afterwards, a transformation from the metric LCS into OGM coordinates was applied based on a manually defined OGM resolution. The resulting OGM coordinates represented the cell indices of the OGM.
During map data sorting, all indoor map objects were assigned to a specific priority level that corresponded to a layer in the rendering process, where the OGM was rendered layer by layer in a fixed order. This assignment was based on the OSM tags of each geo object. For the OGM rendering, we defined a certain order in a lookup table to achieve sensible OGM outputs. This was necessary because geo objects within OSM data sets may overlap. For instance, room areas and their walls were described by two separate geo objects, and the mapped area of a room may overlap with its mapped walls. The OSM mapping scheme considered walls as a second layer over room areas, so that the mapped wall boundaries defined the real-world physical walls above the underlying room areas. Now consider an exemplary rendering output of unsorted data: rendering the walls of a room before rendering its area would lead to a loss of wall position information due to the "over-rendering" of walls by the room area. This example illustrates the necessity of a sensible data sorting in accordance with OSM mapping schemes. In addition to the order in which objects were rendered, the layer specified the occupancy probability for objects in that layer and the shape with which each object was rendered.
Finally, the OGM was built by rendering the map objects that were assigned to the specific layers with their according probabilities.

Input Data
The algorithm required three sets of input information: The OSM indoor map data, the origin of the local coordinate system and the desired grid map resolution. The OSM indoor map data were stored within an Extensible Markup Language (XML) file including indoor geo objects, such as rooms, walls, doors and corridors, characterized by a set of nodes with longitude and latitude coordinates as well as by OSM tags, which described the meaning of each object.
The second required information was the position and orientation of the LCS. The position consisted of a WGS-84 coordinate (latitude and longitude) and the orientation was defined by the rotation angle between the ordinate of the WGS-84 coordinate system and the ordinate of the LCS. The origin of the LCS as well as the rotation angle could be set by using the JOSM (Java-OpenStreetMap)-Editor [42] with measurement functionalities provided by plug-ins [43].
The grid map resolution was the third input of this algorithm and was specified in grid cells per meter. In this article, grid cells are denoted as pixels and the resolution is therefore given in pixels per meter (px/m). This resolution was used for the conversion of geo object positions to the OGM coordinate system, which used pixels as units.
The described input data were parsed at the initialization of the algorithm and stored in an internal data structure for further processing in the following computing steps.

Step 1: Coordinate System Transformation
Because indoor positioning systems comprise several system components, each with its own coordinate system, it is necessary that all components share the same local coordinate system in the overall indoor positioning application. Such components can be different kinds of sensors, robots or algorithms that further process positioning data.
For OGM generation, the input data needed to be transformed into OGM coordinates, which then represented the originally metric dimensions of an indoor environment in OGM pixel coordinates, as already outlined in Figure 3.
By means of a geographic library [44,45], the indoor map coordinates were transformed from WGS-84 format into a metric local coordinate system by solving the inverse geodesic problem (IGP), which determines the shortest route between two points on the surface of the Earth. Algorithm 1 presents the data flow in detail.
Algorithm 1: Algorithm to calculate local metric distances ($\Delta x_\omega$ and $\Delta y_\omega$) between each indoor coordinate and the origin of the LCS.
Data: OSM indoor data with indoor coordinates (coords) and origin of the LCS (origin) in WGS-84 coordinates
Result: indoor data in local metric coordinates
create empty result list;
foreach indoor coordinate (coord) in coords do
    solve IGP for coord and origin;
    extract metric distance between coord and origin ($d$) as well as the azimuth at origin ($azi$) from IGP;
    calculate $\Delta x_\omega = d \cdot \sin(azi)$;
    calculate $\Delta y_\omega = d \cdot \cos(azi)$;
    create local coordinate of $\Delta x_\omega$ and $\Delta y_\omega$ and add to result list;
end
return result list
The results were the metric distance components $\Delta x_\omega$ and $\Delta y_\omega$ on the latitude and longitude arc between the origin of the LCS and a node of a geo object at the position $(x_{L1}, y_{L1})$. Afterwards, the rotation angle $\alpha$ of the LCS was applied to transform $\Delta x_\omega$ and $\Delta y_\omega$ into local x and y coordinates. Finally, the map data in LCS coordinates were transformed into OGM coordinates. This was achieved by manually defining the resolution of the OGM in pixels per meter (px/m) and afterwards multiplying each LCS coordinate by this resolution value to obtain rounded OGM coordinates in pixels (px). The choice of the resolution depended on the accuracy of the positioning system whose results were to be improved. For instance, a UWB system with a precision of 1 cm could use an OGM with a resolution of 1 px/0.01 m = 100 px/m.
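The transformation chain of Step 1 can be sketched as follows. This is a minimal illustration under stated assumptions, not the implementation used in the study: the inverse geodesic problem is approximated on a sphere (haversine distance and initial bearing) instead of the ellipsoidal solver of the geographic library, the rotation angle $\alpha$ is assumed to be measured counter-clockwise, and the names `inverse_geodesic_sphere` and `wgs84_to_ogm` are hypothetical.

```python
import math

EARTH_RADIUS_M = 6_371_008.8  # mean Earth radius; spherical stand-in for the WGS-84 ellipsoid


def inverse_geodesic_sphere(lat1, lon1, lat2, lon2):
    """Return (distance in m, azimuth at point 1 in rad, clockwise from north)."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dlam = math.radians(lon2 - lon1)
    a = (math.sin((phi2 - phi1) / 2) ** 2
         + math.cos(phi1) * math.cos(phi2) * math.sin(dlam / 2) ** 2)
    d = 2 * EARTH_RADIUS_M * math.asin(math.sqrt(a))
    azi = math.atan2(
        math.sin(dlam) * math.cos(phi2),
        math.cos(phi1) * math.sin(phi2)
        - math.sin(phi1) * math.cos(phi2) * math.cos(dlam),
    )
    return d, azi


def wgs84_to_ogm(coords, origin, alpha_deg, resolution_px_per_m):
    """Map (lat, lon) pairs to rounded OGM pixel coordinates.

    Assumption: the LCS axes are rotated counter-clockwise by alpha_deg
    relative to the east/north axes.
    """
    cos_a = math.cos(math.radians(alpha_deg))
    sin_a = math.sin(math.radians(alpha_deg))
    result = []
    for lat, lon in coords:
        d, azi = inverse_geodesic_sphere(origin[0], origin[1], lat, lon)
        dx = d * math.sin(azi)  # east component, Delta x_omega
        dy = d * math.cos(azi)  # north component, Delta y_omega
        x = dx * cos_a + dy * sin_a   # rotate into the LCS
        y = -dx * sin_a + dy * cos_a
        result.append((round(x * resolution_px_per_m),
                       round(y * resolution_px_per_m)))
    return result
```

For building-sized extents the spherical approximation differs only slightly from an ellipsoidal solution, but an ellipsoidal solver should be preferred where the full accuracy of the positioning system is needed.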

Step 2: Map Data Sorting
As already described in Section 3.1, an ordered map data set is necessary for generating an OGM with a layer-wise rendering methodology that also involves over-rendering. However, due to the structure of the OSM data definition, the geo objects within the indoor map data are unsorted and might overlap. Therefore, a lookup table that assigned relevant tags of the geo objects to a specific rendering layer was created. This lookup table is shown in Figure 5. The first rendering layer (L1) held basic indoor geo objects with the lowest limitations for positionable objects, such as rooms, corridors, steps, stairways or elevators. In this sense, lowest limitations refers to the highest probability of a valid indoor position. The second layer (L2) contained all kinds of walls, which definitely restricted the freedom of movement of positionable objects. Openings, such as doors and entrances, were considered separately and placed in a third layer (L3) atop the walls to enable the overwriting of limitations set by the walls.
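The tag-based layer assignment can be sketched as follows. The entries of `LAYER_LOOKUP` are illustrative assumptions based on common Simple Indoor Tagging keys; they do not reproduce the actual lookup table of Figure 5, and the function names are hypothetical.

```python
from typing import Optional

# Hypothetical lookup table: (tag key, tag value) -> rendering layer.
# "*" matches any value of the given key.
LAYER_LOOKUP = {
    ("indoor", "room"): 1,
    ("indoor", "corridor"): 1,
    ("indoor", "area"): 1,
    ("highway", "steps"): 1,
    ("highway", "elevator"): 1,
    ("indoor", "wall"): 2,
    ("door", "*"): 3,
    ("entrance", "*"): 3,
}


def layer_of(tags: dict) -> Optional[int]:
    """Return the highest layer matched by any tag, or None if irrelevant."""
    layers = []
    for key, value in tags.items():
        layer = LAYER_LOOKUP.get((key, value))
        if layer is None:
            layer = LAYER_LOOKUP.get((key, "*"))
        if layer is not None:
            layers.append(layer)
    return max(layers) if layers else None


def sort_for_rendering(geo_objects):
    """Drop irrelevant objects and order the rest by rendering layer (L1 first)."""
    layered = []
    for obj in geo_objects:
        layer = layer_of(obj["tags"])
        if layer is not None:
            layered.append((layer, obj))
    layered.sort(key=lambda pair: pair[0])  # stable sort keeps input order per layer
    return layered
```

Sorting by layer guarantees that walls are drawn after room areas and openings after walls, regardless of the order in which the objects appear in the OSM file.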

Step 3: OGM Rendering
As a final step of the OGM generation, the rendering was performed. Thereby, every floor of a building with its specific geometry resulted in a separate OGM. Consequently, a canvas was created for every floor and the dimensions of these canvases were defined by the lowest and highest OGM coordinates of every floor, which designated the canvas boundaries.
The rendering itself used 8-bit grey scale values, which encoded positioning probabilities in a range from 0% (0.0) to 100% (1.0) with a resolution of approximately 0.4% (1/255). When rendering indoor areas of Layer L1, i.e., rooms, corridors or steps, these areas were filled with a grey scale value of 0.75, as shown in Figure 6b. Using a probability value of 75% instead of 100% made it possible to subsequently add popular paths with even higher probabilities, so that the grid map could be optimized in retrospect if such frequently used paths are known.
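The 8-bit encoding described above can be illustrated with a short sketch. The function names are hypothetical, and rounding to the nearest grey level is an assumption.

```python
def probability_to_grey(p: float) -> int:
    """Encode a probability in [0.0, 1.0] as an 8-bit grey value (0..255)."""
    if not 0.0 <= p <= 1.0:
        raise ValueError("probability must lie in [0, 1]")
    return round(p * 255)


def grey_to_probability(grey: int) -> float:
    """Decode an 8-bit grey value back to a probability."""
    return grey / 255


STEP = 1 / 255  # smallest representable probability difference, about 0.39%
```

With this encoding, the base probability of 0.75 for Layer L1 areas is stored as grey value 191, which decodes to approximately 0.749.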
This step was followed by rendering a gradient tube adjacent to the inner boundaries of the resulting area of Layer L1, see Figure 6c. This gradient modelled a lower probability of positions at the boundaries of indoor areas. Hence, the occupancy probability decreased towards the walls, in our case in the shape of a smoothed circle whose grey value was linearly reduced with increasing radius. Other shapes, such as normal distributions, would be sensible as well. The dimensions of the shape were relative to the size of the indoor area.
In the next step, the walls of Layer L2 were rendered atop the L1 area. Walls were also modelled as areas and were filled with a positioning probability of 0.0, which ensured that these areas were inaccessible for positionable objects and persons. This rendering step is visualized in Figure 6d.
As shown in Figure 6e, Layer L3 included openings and the probability behavior of the area around them. The openings were rendered atop the previous layers and represented a positioning probability of 0.75. Empirical experience has shown that people enter or leave openings in a shape similar to a funnel. In the case of openings that were accessible in both directions, two funnels were used for rendering, so that the resulting shape resembled an hourglass. The funnels were rendered perpendicular to the wall surrounding the respective opening. Finally, applying the complete algorithm to OSM indoor data of a certain building delivered an OGM for each level of this building, which was the output data of the OGM generation algorithm.
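The layer-wise over-rendering of Step 3 can be sketched in a strongly simplified form. In this illustration, geo objects are reduced to axis-aligned rectangles in pixel coordinates, the boundary gradient and the funnel shapes are omitted, and cells outside any object default to a probability of 0.0. All of these simplifications, as well as the name `render_ogm`, are assumptions.

```python
def render_ogm(width_px, height_px, layered_rects):
    """Render rectangles layer by layer; later layers overwrite earlier ones.

    layered_rects: iterable of (layer, (x0, y0, x1, y1), probability),
    where each rectangle is half-open: x0 <= x < x1 and y0 <= y < y1.
    """
    grid = [[0.0] * width_px for _ in range(height_px)]
    for _, (x0, y0, x1, y1), prob in sorted(layered_rects, key=lambda r: r[0]):
        for y in range(max(y0, 0), min(y1, height_px)):
            for x in range(max(x0, 0), min(x1, width_px)):
                grid[y][x] = prob  # over-rendering: the higher layer wins
    return grid


# Usage: a room (L1) split by a wall (L2) that contains a door (L3).
example = render_ogm(5, 3, [
    (3, (2, 1, 3, 2), 0.75),  # door opening
    (2, (2, 0, 3, 3), 0.0),   # wall
    (1, (0, 0, 5, 3), 0.75),  # room area
])
```

Although the objects are passed in reverse order here, sorting by layer ensures that the wall overwrites the room area and the door overwrites the wall.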

Method
The method of positioning improvement is described using the example of a university building. Figure 7 visualizes the method: to obtain more accurate positioning information and to exclude implausible positions (e.g., positions within walls), we calculated the Hadamard product $H$, i.e., an element-wise multiplication, of a Gaussian mask $G$ (shape and size are explained below in Section 3.2.2) and the OGM generated from the building. This OGM is denoted as $\mathit{OGM}$ in the following equation:
$H = G \circ \mathit{OGM}$ (1)
To calculate $H$ for a specific measurement, the center point of $G$ was aligned with the measured position. Then, $H$ was calculated according to Equation (1). In a subsequent step, the corrected position in OGM pixel coordinates was obtained by extracting the position of the maximum in $H$. This step is visualized in Figure 8. The final position in metric coordinates was obtained by means of the resolution, which was 0.01 m per pixel in both x and y direction in our example.
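The correction step can be sketched as follows: the mask is centered on the measured pixel, multiplied element-wise with the OGM cells it covers, and the position of the maximum product is returned. This is a minimal pure-Python illustration; the function name `correct_position`, the handling of cells outside the OGM (they are skipped) and the tie-breaking rule (first maximum in row-major order) are assumptions.

```python
def correct_position(ogm, mask, measured_xy):
    """Return the OGM pixel (x, y) that maximizes mask * OGM around the measurement.

    ogm and mask are lists of rows holding probabilities and mask weights.
    If every product is zero, the measured position is returned unchanged.
    """
    mx, my = measured_xy
    half_h, half_w = len(mask) // 2, len(mask[0]) // 2
    best_value, best_xy = 0.0, measured_xy
    for j, mask_row in enumerate(mask):
        y = my - half_h + j
        if not 0 <= y < len(ogm):
            continue  # mask row lies outside the OGM
        for i, weight in enumerate(mask_row):
            x = mx - half_w + i
            if not 0 <= x < len(ogm[0]):
                continue  # mask column lies outside the OGM
            h = weight * ogm[y][x]  # one element of the Hadamard product H
            if h > best_value:
                best_value, best_xy = h, (x, y)
    return best_xy
```

Dividing the returned pixel coordinates by the OGM resolution in px/m yields the corrected position in meters.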

Gaussian Mask Parameters
The Gaussian mask was constructed to have a size of 6 m × 6 m and a standard deviation of 1.5 m. These parameters modelled the accuracy of the UWB positioning system, which was affected by so-called hard Non-Line-of-Sight (NLOS) conditions and by additional errors near the border regions of the positioning area defined by the lines connecting the outer UWB Satlets of a setup. The NLOS conditions reduced the accuracy of UWB systems to approximately 1 m [46]. Due to the mentioned additional errors in border regions, the accuracy was reduced even further [47], resulting in our estimated standard deviation of 1.5 m, which models a lower confidence. The mask size of 6 m × 6 m was chosen so that the mask extends two standard deviations (3 m) from its center in each direction in order to cover approximately 95% of the area below the Gaussian function. To construct the Gaussian mask in pixels, the metric sizes were multiplied by the resolution of the OGM. For instance, an OGM resolution of 10 px/m led to a mask size of 60 px × 60 px.
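A mask with these parameters can be constructed as in the following sketch. The function name `gaussian_mask`, the unnormalized form of the Gaussian (peak value close to 1) and the placement of the center between the two middle pixels for even mask sizes are assumptions.

```python
import math


def gaussian_mask(size_m=6.0, sigma_m=1.5, resolution_px_per_m=10):
    """Build a square, isotropic Gaussian mask as a list of rows."""
    size_px = round(size_m * resolution_px_per_m)
    sigma_px = sigma_m * resolution_px_per_m
    center = (size_px - 1) / 2.0  # geometric center of the pixel grid
    return [
        [
            math.exp(-((x - center) ** 2 + (y - center) ** 2) / (2 * sigma_px ** 2))
            for x in range(size_px)
        ]
        for y in range(size_px)
    ]


mask = gaussian_mask()  # 60 px x 60 px at 10 px/m
```

Since only the position of the maximum of the product with the OGM is evaluated, normalizing the mask to unit sum would not change the corrected position.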

Results and Discussion
This section presents and discusses the results of the automated OGM generation. Additionally, the methodology for evaluating the impact of the positioning improvement by means of OGMs as well as the corresponding results are presented and discussed.

OGM Generation
We performed experiments with nine different building map data sets, which showed qualitatively that the algorithm delivers reasonable OGMs. An example of one of these generated OGMs, which represents a complete floor of a building, is shown in Figure 9. When comparing the input, i.e., the OSM indoor data, with the rendered OGM, it can be seen that the different objects of the three layers were correctly represented in the form of probability grey scale values. Our study therefore demonstrated that the proposed automated generation of OGMs from crowd-sourced OSM indoor data delivers reliable results, provided that the indoor environment is mapped correctly.
Nevertheless, the algorithm still had three limitations, which should be addressed in future work. Firstly, the algorithm did not automatically evaluate the quality and correctness of the input OSM data. Because these data were mapped by volunteers without any special training, they can be very imprecise, and even necessary features such as doors may be missing, which could negatively affect the resulting OGM. With the current implementation, no automatic validation of the input data was performed, so that a manual plausibility check of the generated OGM had to be carried out. Consequently, validating the input data remains an open problem. Secondly, hourglasses, which were rendered with a fixed pre-defined size at door positions, were not seamlessly connected to the base probability of the indoor area. This is because the width of the probability gradient at the borders of the indoor area depended on the size of this area (as noted in Section 3.1.4). Accordingly, the size of the hourglass should be dynamically adapted to the room size as well. Thirdly, the current placement of the hourglass in narrow corners of a room led to an unwanted overwriting of wall information.

Positioning Improvement
In order to evaluate the influence of the generated OGM on the positioning accuracy, we installed several UWB Satlets in our example building, which spanned an inner positioning area. This building had a length of 22 m in x direction and a width of 10 m in y direction. Two setups with different reference positions (RP) were defined in the example building. The UWB tag to be located was placed at three different reference positions for each setup. Setup 1 (S1) validated the influence of OGMs when five Satlets were used and is visualized in Figure 10. Setup 2 (S2) investigated the effect of OGMs when the number of Satlets used for positioning was decreased. In this setup, measurements with an infrastructure of four Satlets were performed for three different reference positions.
For each reference position, which was specified by means of laser distance measurements with an accuracy better than 0.01 m, we collected several UWB measurements (up to 500 per RP). Firstly, we calculated the mean errors in x and y direction, $e_x$ and $e_y$, as well as the overall mean error $e$ between the measurements and the reference position without employing the OGM. The overall error was defined as the Euclidean distance between the measurement and the reference. Secondly, we employed the OGM to obtain corrected positions and calculated $e_x$, $e_y$ and $e$ again. The results without and with OGM employment are summarized in Tables 2 and 3. In addition, we used empirical cumulative distribution function (ECDF) plots to rank the overall position errors for all measurements of each reference position.
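The error measures described above can be sketched as follows. Whether the per-axis errors were averaged as absolute or signed deviations is not stated, so the use of absolute deviations here is an assumption, as are the function names.

```python
import math


def error_statistics(measurements, reference):
    """Return (e_x, e_y, e): mean absolute axis errors and mean Euclidean error."""
    rx, ry = reference
    n = len(measurements)
    e_x = sum(abs(x - rx) for x, _ in measurements) / n
    e_y = sum(abs(y - ry) for _, y in measurements) / n
    e = sum(math.hypot(x - rx, y - ry) for x, y in measurements) / n
    return e_x, e_y, e


def ecdf(errors):
    """Return sorted (error, cumulative fraction) pairs for an ECDF plot."""
    ordered = sorted(errors)
    n = len(ordered)
    return [(value, (i + 1) / n) for i, value in enumerate(ordered)]
```

Applying both functions once to the raw measurements and once to the OGM-corrected positions of the same reference position gives the two curves and table rows that are compared in the evaluation.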
In the following, the evaluation setup as well as the results for S1 are presented. For S1, Table 2 and Figure 11 show that OGMs had only a minor influence on positions that already had sub-meter accuracy. We also applied two-sample t-tests to analyze the statistical significance of the differences between errors with and without OGMs. The results of these tests confirmed the minor influence of the used OGM for RP 1 and RP 3. Likewise, the t-test for RP 2 confirmed the significant positive effect that was already visible in the ECDF plot in Figure 11. Overall, the OGMs increased positioning accuracy in our examples.
In the following, the evaluation setup as well as the results for S2 are presented. The setup is illustrated in Figure 12. The different samples are encoded with red for RP 4, green for RP 5 and blue for RP 6. The circle denotes the reference position, the rectangle the measured position and the star the corrected position. The transparent grey rectangle is the inner positioning area of the UWB system, which is spanned by the Satlets. Having a closer look at S2 with Figures 12 and 13 and Table 3, we can see that for RP 4, the employment of the OGM improved the positioning significantly: Without employing the OGM, the UWB system located the UWB tag very close to the wall and sometimes even within the wall, while the reference was actually in the doorway. With the help of the OGM, the incorrect UWB measurements were shifted to the door. This reduced the mean errors in x and y direction significantly. This setup demonstrated that for scenarios as described in Figure 1, the OGM improved the positioning output, which we could also verify with a two-sample t-test.
In contrast to RP 4, the improvements for RP 5 and RP 6 were less obvious. Although we could see an improvement in y direction, where the measurements were shifted away from the wall, the error in x direction remained consistently high. This is because the UWB tags were located very close to or even outside the inner positioning area, so that the located position was error-prone for the reasons already mentioned in Section 3.2, see [47]. We furthermore checked the statistical significance of the results for RP 5 and RP 6 by performing two-sample t-tests. The results showed that the differences between these errors were still statistically significant, even though they contributed little to improving the positioning result in this particular example. Consequently, RP 5 and RP 6 demonstrated the limitation of OGMs: Although implausible measurements close to walls could be shifted away from the wall, OGMs were not capable of correcting completely incorrect localization outputs.
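A two-sample t-test of the kind applied above could look roughly as follows. The sketch assumes Welch's unequal-variance variant and approximates the two-sided p-value with a standard normal distribution, which is only reasonable for sample sizes in the order of the several hundred measurements collected per RP. Neither choice is specified in the text, and the function name `welch_t_test` is hypothetical.

```python
import math
from statistics import NormalDist, mean, variance


def welch_t_test(errors_without_ogm, errors_with_ogm):
    """Return (t, df, p_approx) for two independent samples of position errors.

    Assumptions: Welch's t-test (unequal variances) and a normal
    approximation of the two-sided p-value (large samples only).
    """
    a, b = errors_without_ogm, errors_with_ogm
    # Squared standard errors of the two sample means.
    va = variance(a) / len(a)
    vb = variance(b) / len(b)
    t = (mean(a) - mean(b)) / math.sqrt(va + vb)
    # Welch-Satterthwaite approximation of the degrees of freedom.
    df = (va + vb) ** 2 / (va ** 2 / (len(a) - 1) + vb ** 2 / (len(b) - 1))
    p_approx = 2.0 * (1.0 - NormalDist().cdf(abs(t)))
    return t, df, p_approx
```

A small `p_approx` indicates that the mean errors with and without the OGM differ significantly. As the results for RP 5 and RP 6 show, such a difference can be significant without being large enough to be useful in practice.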

Conclusions and Future Work
In this article, we dealt with the problem of improving accuracy for indoor positioning. Prior work has demonstrated the effectiveness of integrating external data sources such as OGMs, which represent occupancy probabilities on indoor maps, in the positioning process. These previous approaches, however, have focused on OGM generation for outdoor environments or require manual steps. Moreover, commercially provided map data such as Google Maps does not allow free access to the raw data required for OGM generation. To counter these issues, we proposed to use freely available crowd-sourced data, namely OSM data that are mapped by a large community. We furthermore proposed a method that for the first time automates the generation of OGMs for indoor environments. Finally, we evaluated our proposed procedure by demonstrating the benefit of integrating the automatically generated OGMs in positioning on a real example. Our results provide evidence for increasing positioning accuracy but also reveal limitations. These limitations for both OGM generation and positioning improvement have been discussed. In the following, future work for both OGM generation and positioning improvement is presented.

OGM Generation
In addition to the limitations concerning OGM generation that were presented in the previous section and will be addressed in the future, a further development is planned: In future versions of the implementation, points of interest (POIs) and popular paths in buildings will be integrated in the OGM, as already pointed out in Section 3.1.4.
The kind of POIs depends on the use case of the intended positioning system. For instance, an indoor navigation system for museum visitors must consider the area around paintings as places with high probabilities for positions. Because paintings are typically mounted on walls, the occupancy value of the OGM in such areas must be increased.
For the definition of popular paths, both manual and automated approaches can be applied: a manual solution is to ask several persons to draw the paths in the map that they consider likely to be frequently used. Automated methods, however, would be more appropriate. One sensible solution is a learning-based approach where the most frequently used paths are derived from the actual positioning output and a kind of heat map is generated. The more detections are registered in a cell of an OGM, the higher is the path probability, i.e., the heat, of this cell. A further method could determine paths by applying skeletonization algorithms to the indoor map areas, where the remaining topological skeleton denotes these paths. Finally, another option is to determine direct paths between relevant objects, for example the direct lines of sight from one door to other doors in a room.
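The learning-based heat map idea can be illustrated with a short sketch. It is only one possible reading of the approach outlined above: the function name `path_heat_map`, the cell size, and the normalization by the busiest cell are assumptions made for illustration.

```python
import math


def path_heat_map(positions, width_m, height_m, cell_size_m):
    """Count detections per grid cell and scale the counts to [0, 1].

    positions: iterable of (x, y) positioning outputs in metres.
    Returns a list of rows; heat[row][col] is 1.0 for the busiest cell.
    Assumption: heat is the detection count divided by the maximum count.
    """
    cols = math.ceil(width_m / cell_size_m)
    rows = math.ceil(height_m / cell_size_m)
    counts = [[0] * cols for _ in range(rows)]
    for x, y in positions:
        col = int(x // cell_size_m)
        row = int(y // cell_size_m)
        # Ignore detections that fall outside the mapped area.
        if 0 <= row < rows and 0 <= col < cols:
            counts[row][col] += 1
    peak = max(max(r) for r in counts)
    if peak == 0:
        return [[0.0] * cols for _ in range(rows)]
    return [[c / peak for c in r] for r in counts]
```

For the 22 m by 10 m example building, `path_heat_map(positions, 22.0, 10.0, 0.5)` would produce a grid of 20 rows and 44 columns. How such heat values should be blended with the existing occupancy values of the OGM is left open here.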

Positioning Improvement
With regard to employing OGMs for positioning improvement, future work will focus on the following aspects. In the current implementation, the standard deviation and the mask size are set to fixed values. However, these parameters should be adjusted dynamically depending on the confidence with which the current position is measured. In the future, the UWB system used in this study will deliver such confidence information for each measured position, which can be used to dynamically adjust the standard deviation in the OGM correction step. In that way, measurements close to or outside the border of the positioning area can be treated with both a higher standard deviation and a larger mask size to find more probable positions.
In addition, we plan to evaluate the proposed method in further and larger buildings, such as our university library. For this purpose, we have to create or use existing OSM indoor maps for these buildings, generate the corresponding OGMs, integrate UWB positioning systems in the building infrastructure and define reference positions for the evaluation. Finally, this improved UWB positioning system shall be used for indoor navigation applications. Based on the user's current position, information about surrounding landmarks can be forwarded to the user to facilitate his or her orientation within an unknown building. Moreover, similar to outdoor navigation, the user's position shall be used to update the route to a target within the building.
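How a confidence value might drive the standard deviation and the mask size is sketched below. This is a hypothetical illustration, not the correction step of this study: it assumes a confidence in [0, 1], a linear mapping to the standard deviation, a mask covering three standard deviations, and a correction that picks the cell with the highest product of Gaussian weight and occupancy probability. All function names and default values are invented for the example.

```python
import math


def correction_parameters(confidence, sigma_min=0.2, sigma_max=1.0,
                          cell_size_m=0.1):
    """Map a confidence in [0, 1] to (sigma in metres, odd mask size in cells).

    Assumption: low confidence gives a large sigma and a large mask.
    """
    c = min(max(confidence, 0.0), 1.0)
    sigma = sigma_max - c * (sigma_max - sigma_min)
    # Cover three standard deviations on each side of the measurement.
    half = math.ceil(round(3.0 * sigma / cell_size_m, 9))
    return sigma, 2 * half + 1


def correct_position(ogm, row, col, sigma_cells, mask_size):
    """Return the (row, col) inside the mask that maximizes
    Gaussian weight (centred on the measurement) times OGM probability."""
    half = mask_size // 2
    best, best_score = (row, col), -1.0
    for r in range(max(0, row - half), min(len(ogm), row + half + 1)):
        for c in range(max(0, col - half), min(len(ogm[0]), col + half + 1)):
            dist_sq = (r - row) ** 2 + (c - col) ** 2
            score = math.exp(-dist_sq / (2.0 * sigma_cells ** 2)) * ogm[r][c]
            if score > best_score:
                best, best_score = (r, c), score
    return best
```

With this mapping, a measurement reported with low confidence, for example one near the border of the positioning area, is searched over a wider neighbourhood than a high-confidence measurement. Note that `sigma` is returned in metres and has to be divided by the cell size before it is passed to `correct_position` as `sigma_cells`.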
To sum up, existing indoor positioning technologies are always prone to errors and can lead to inaccurate and implausible positions. We demonstrated the benefit of generating OGMs from crowd-sourced maps and proposed an automated OGM generation algorithm for this purpose. Furthermore, we integrated these OGM data in an indoor positioning algorithm using UWB and improved the positioning results.