Indoor Localization Based on Weighted Surfacing from Crowdsourced Samples

Lin, Junhong; Wang, Bang; Yang, Guang; Zhou, Mu

doi:10.3390/s18092990


by Junhong Lin ¹, Bang Wang ¹,*, Guang Yang ¹ and Mu Zhou ²

¹ School of Electronic Information and Communications, Huazhong University of Science and Technology (HUST), Wuhan 430074, China
² Chongqing Key Lab of Mobile Communications Technology, School of Communication and Information Engineering, Chongqing University of Posts and Telecommunications, Chongqing 400065, China
* Author to whom correspondence should be addressed.

Sensors 2018, 18(9), 2990; https://doi.org/10.3390/s18092990

Submission received: 9 August 2018 / Revised: 28 August 2018 / Accepted: 3 September 2018 / Published: 7 September 2018

(This article belongs to the Special Issue Pervasive Intelligence and Computing)


Abstract: Fingerprinting-based indoor localization suffers from its time-consuming and labor-intensive site survey. As a promising solution, sample crowdsourcing has recently been promoted to exploit casually collected samples for building the offline fingerprint database. However, crowdsourced samples may be annotated with erroneous locations, which raises a serious question about whether they are reliable for database construction. In this paper, we propose a cross-domain cluster intersection algorithm to weight the reliability of each sample. We then select the samples with higher weights to construct radio propagation surfaces by fitting polynomial functions. Furthermore, we employ an entropy-like measure to weight the constructed surfaces, quantifying their different subarea consistencies and location discriminations in online positioning. Field measurements and experiments show that the proposed scheme achieves high localization accuracy by coping well with the sample annotation error and nonuniform density challenges.

Keywords: fingerprinting localization; sample crowdsourcing; sample weighting; surface fitting

1. Introduction

Fingerprinting has been extensively researched for indoor localization systems in the last decade [1,2,3,4]. The basic idea rests on the assumption that each indoor location can be identified by a unique signal feature, called a fingerprint. The most widely used fingerprint is a vector of the received signal strengths (RSS) from the access points (APs) of wireless local area networks. The location of a test fingerprint is estimated as the known location whose fingerprint has the minimal signal difference from it. One of the key challenges in supporting such fingerprinting localization is to construct an indoor radio map in the offline training phase [5,6,7,8]. Normally, the indoor environment is divided into non-overlapping grid cells, and surveyors conduct a site survey to collect RSS samples at each grid center, from which one fingerprint is trained for each grid. However, this scheme suffers from the time-consuming and labor-intensive site survey required for radio map construction.

Recently, fingerprint crowdsourcing has been promoted to relieve or even eliminate the burdensome site survey by exploiting casually collected RSS samples [6,7,8,9]. Although not collected at specified locations, crowdsourced RSS samples still need to be annotated with some location information for fingerprint database construction. To this end, a common approach is to extract RSS samples from pedestrian movement trajectories [10,11,12]. As long as a trajectory can be correctly matched to one physical route, each step position can be obtained from the floor plan to annotate the RSS sample of the corresponding step.

Although fingerprint crowdsourcing seems a promising approach, care must be taken to deal with the samples with erroneously annotated locations. Compared with the site survey, such erroneous location annotation of crowdsourced samples could lead to an inaccurate radio map and degrade the performance of fingerprinting-based localization. Besides annotation errors, another challenge lies in that crowdsourced samples may not be uniformly distributed in the whole environment.

In this paper, we study indoor localization through constructing radio propagation surfaces from crowdsourced samples. For each AP, its surface takes a location as input and outputs an estimated RSS at this location. To deal with annotation errors, we propose a cross-domain cluster intersection algorithm to assign each sample a reliability weight, which exploits the sample clustering results from both the physical and signal space. We next select a subset of weighted samples to fit a surface for each AP from polynomial primary functions, and construct subarea fingerprints by sampling the AP surfaces. Furthermore, we compute two weights for each AP surface to describe its subarea consistency and location discrimination capability in online positioning. A two-step positioning algorithm is proposed that first determines the subarea to which a test sample belongs and then applies a weighted surface search to estimate its location within the subarea. We conducted field measurements and experiments. Compared with the peer schemes, the results validate the effectiveness and robustness of the proposed scheme in terms of its lower localization error when facing the sample annotation error and nonuniform density challenges.

The rest of the paper is organized as follows. Section 2 briefly reviews the most closely related work and gives an overview of the proposed system. The proposed offline surface fitting algorithm is presented in Section 3. Section 4 presents our online localization algorithm. Field measurements are used for experiments, and the results are provided in Section 5. Finally, Section 6 concludes the paper.

2. Related Work and System Overview

2.1. Related Work

Several fingerprinting systems based on sample crowdsourcing have been proposed for indoor localization in previous studies [13,14,15,16,17,18]. For example, Chen and Wang [13] proposed using a density-based clustering technique to group crowdsourced samples to generate a cluster fingerprint and using a matching algorithm to assign each cluster fingerprint to one subarea for room-level localization. Liu et al. [14] also applied crowdsourced samples for room-level localization yet with an improved energy-efficient sampling approach. Chang et al. [15] applied a local Gaussian process to construct grid fingerprints from crowdsourced samples. Jung et al. [16] adopted a hybrid global-local optimization scheme to determine the location of fingerprint sequences based on the constraint of the indoor structure, rather than using labeled fingerprints. They also proposed an unsupervised learning method to calibrate the localization model.

In the literature, many have proposed to extract crowdsourced samples from pedestrian trajectories. The core idea is to match a trajectory to one physical route such that each sample on a trajectory can be labeled with one location in the route [19,20,21,22,23,24,25]. For example, Kim et al. [19] combined the lightweight site survey and fingerprint crowdsourcing for radio map construction. They first constructed an initial radio map according to the lightweight site survey and then used pedestrian dead reckoning (PDR) to match the war-walking paths into the radio map. Huang et al. [20] exploited layout landmarks such as the cross points of corridors for matching pedestrian trajectories to physical routes. Zhou et al. [23] proposed to transform the indoor layout into a semantic graph to map with activity sequences contained within the trajectories. Zhou et al. [25] applied a density-based spatial clustering algorithm to determine hotspots which are then mapped to physical subareas.

For crowdsourced samples, the conventional approach is to construct a radio map for grid fingerprints. In Ref. [26], Wang et al. proposed using polynomial functionals to fit a propagation surface for each AP based on a few reference fingerprints with correct location annotations. Ye and Wang [27] applied the surfacing method to deal with the problem of non-uniformly distributed crowdsourced samples, with the objective of composing grid fingerprints for radio map construction. Unlike these approaches, this paper proposes to exploit crowdsourced samples for fitting radio propagation surfaces. As crowdsourced samples normally have inaccurate location labels, how to construct a reliable surface is rather challenging. In this paper, we propose a sample weighting algorithm and apply weighted samples to fitting surfaces.

2.2. System Overview

We divide an indoor environment into several distinct subareas, such as rooms and corridors, according to the functional layout formed by inherent obstructions and partitions such as concrete walls. We assume that each crowdsourced sample has been annotated with some location, though possibly with annotation errors. We attribute each sample to one subarea according to its annotated location. Like conventional fingerprinting systems, the proposed system consists of an offline phase and an online phase.

The offline phase consists of four steps: (1) Weighting crowdsourced samples, which assigns each crowdsourced sample a reliability weight based on our proposed cross-domain cluster intersection algorithm; (2) Fitting radio surfaces, which constructs a radio propagation surface for each AP based on the weighted samples; (3) Weighting fitted surfaces, which assigns each fitted surface two weights to discriminate their contributions to online localization; and (4) Constructing subarea fingerprints, which creates an RSS fingerprint for each subarea from its fitted and weighted surfaces.

The online localization consists of two steps: (1) Subarea determination, which first locates an online test fingerprint in one subarea according to our proposed weighted signal distance; and (2) Location search, which searches for the coordinate of the test fingerprint by a gradient search on the constructed surfaces. Figure 1 presents the main flowchart of the proposed system, and Table 1 lists the symbols used in this paper as well as their notations.

3. The Offline Weighted Surfacing Algorithm

3.1. Weighting Crowdsourced Samples

In one subarea, e.g., a room, let $S = \{s_1, \dots, s_M\}$ denote its set of $M$ crowdsourced samples. A sample $s_i = (\vec{l}_i, \vec{r}_i)$ consists of two parts: $\vec{l}_i = (x_i, y_i)$ is its annotated location, and $\vec{r}_i = (r_{i1}, r_{i2}, \dots, r_{iN})$ is the received RSS vector, where $N$ is the maximum number of hearable APs in one subarea. For one sample $s_i$, it is possible that not all the $N$ APs could be heard, that is, some $r_{ij}$ ($j \le N$) might not be available in $s_i$. In this case, to allow the clustering and surfacing algorithms to run normally, we simply set it to a very small RSS value, $r_{\min} = -90$ dBm, which is the lower bound of the collected signal strength, during the following sample clustering and weighting process.
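The fill-in step above can be sketched as follows. This is only an illustration under stated assumptions: each raw scan is taken to be a dictionary from an AP identifier to its RSS in dBm, and the names `to_rss_vector`, `R_MIN_DBM`, and the AP identifiers are hypothetical, not from the paper.

```python
# Lower bound of the collected signal strength stated in the text (dBm).
R_MIN_DBM = -90.0


def to_rss_vector(heard, ap_ids, fill=R_MIN_DBM):
    """Build a fixed-length RSS vector for one sample.

    heard  : dict mapping AP id -> measured RSS (only the APs actually heard)
    ap_ids : ordered list of the N APs hearable in the subarea
    fill   : value used for APs that are missing from this sample
    """
    return [heard.get(ap, fill) for ap in ap_ids]


# Hypothetical scan in which "ap2" was not heard.
vector = to_rss_vector({"ap1": -50.0, "ap3": -72.0}, ["ap1", "ap2", "ap3"])
```

With this layout, every sample yields a vector of the same length, so distances in the signal space are well defined for all sample pairs.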

A crowdsourced sample $s_i$ may not be reliable in that its annotated location $\vec{l}_i$, its RSS measurement $\vec{r}_i$, or both might have some errors. However, among a large number of such samples, we conjecture that some statistical relations could be extracted from the similarities between the physical and signal space. Consider the following example of two samples $s_i$ and $s_j$. Let $d_{ij}^{p} \triangleq \|\vec{l}_i - \vec{l}_j\|$ and $d_{ij}^{s} \triangleq \|\vec{r}_i - \vec{r}_j\|$ denote the distance between the two samples in the physical space and signal space, respectively. Suppose that $d_{ij}^{p}$ is small, indicating that $s_i$ and $s_j$ are close to each other according to their annotated locations. For a small $d_{ij}^{s}$, we could conjecture that both samples are reliable or both samples are unreliable. Although we could not determine which is the real case for only two samples, we might be able to infer the statistical relations from a large number of samples to discriminate unreliable samples. Motivated by such considerations, we next present a cross-domain cluster intersection (CCI) algorithm to assign each sample a reliability weight.

In both the physical and signal space, we group all samples $s_i \in S$ into $K$ clusters by the classic K-means clustering algorithm. Let $C^{p} = \{C_1^{p}, \dots, C_K^{p}\}$ and $C^{s} = \{C_1^{s}, \dots, C_K^{s}\}$ denote the set of clusters in the physical and signal space, respectively. Notice that a sample $s_i$ belongs simultaneously to one cluster in $C^{p}$, say $C_a^{p}$, and to one cluster in $C^{s}$, say $C_b^{s}$. We define a cross-domain cluster coefficient for such a sample $s_i$ based on the cluster intersection between $C_a^{p}$ and $C_b^{s}$ as follows:

$$\gamma_i = \frac{|C_a^{p} \cap C_b^{s}|^{2}}{|C_a^{p}| \times |C_b^{s}|}. \tag{1}$$

If $C_a^{p} = C_b^{s}$, i.e., the two clusters contain the same set of samples, then all such samples have the same coefficient $\gamma_i = 1$. According to the K-means clustering, all samples in $C_a^{p}$ are closer to this cluster center than to other cluster centers. This is also the case for samples in $C_b^{s}$ in the signal space in terms of their RSS vector similarities. Therefore, $|C_a^{p} \cap C_b^{s}|$ describes how many samples are close to each other in both the physical and signal space. A small value of $\gamma_i$ indicates that $s_i$ is not similar to the majority of the two clusters, which might suggest its unreliability. As the surface fitting is done in the signal space, we further normalize $\gamma_i$ to assign the sample weight based on the signal space clusters. For each sample $s_i \in C_b^{s}$, we compute its reliability weight by

$$\omega_i = \frac{\gamma_i}{\max \{\gamma_j \mid s_j \in C_b^{s}\}}, \tag{2}$$

where the denominator is the maximum cross-domain cluster coefficient of the samples in the cluster. Figure 2 illustrates the CCI algorithm and shows the reliability weights computed for some samples.
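Equations (1) and (2) can be sketched in a few lines once the cluster memberships are known. The sketch below assumes the two K-means runs have already produced one cluster label per sample in each space; the function name `cci_weights` and the toy labels are illustrative and not part of the paper.

```python
from collections import Counter


def cci_weights(phys_labels, sig_labels):
    """Return (gamma, omega) lists, one entry per sample.

    phys_labels[i] and sig_labels[i] are the cluster indices of sample i in
    the physical and signal space (e.g., from any K-means implementation).
    """
    if len(phys_labels) != len(sig_labels):
        raise ValueError("label lists must have the same length")
    phys_size = Counter(phys_labels)                     # |C_a^p|
    sig_size = Counter(sig_labels)                       # |C_b^s|
    inter_size = Counter(zip(phys_labels, sig_labels))   # |C_a^p ∩ C_b^s|

    # Equation (1): cross-domain cluster coefficient of every sample.
    gamma = [
        inter_size[(a, b)] ** 2 / (phys_size[a] * sig_size[b])
        for a, b in zip(phys_labels, sig_labels)
    ]

    # Equation (2): normalize by the largest coefficient found in the
    # sample's own signal-space cluster.
    max_gamma = {}
    for b, g in zip(sig_labels, gamma):
        max_gamma[b] = max(max_gamma.get(b, 0.0), g)
    omega = [g / max_gamma[b] for b, g in zip(sig_labels, gamma)]
    return gamma, omega


# Toy labels for six samples and K = 2 clusters in each space.
phys = [0, 0, 0, 1, 1, 1]
sig = [0, 0, 1, 1, 1, 1]
gamma, omega = cci_weights(phys, sig)
```

In this toy case the third sample is the only member of its physical cluster that falls into signal cluster 1, so it receives a much smaller weight than the other samples, which is the behavior the CCI algorithm relies on to flag suspicious annotations.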

3.2. Fitting Radio Surfaces

In one subarea, we construct a radio propagation surface for each hearable AP based on the weighted samples $(s_i, \omega_i)$. A surface function $\phi(x, y)$ takes a location as its input and outputs an estimated RSS at this location. Notice that the number of crowdsourced samples could be large and keep increasing. To reduce computational complexity and alleviate surface overfitting, we propose a percentile weight partition (PWP) method to select only a subset of weighted samples.

Define $p_{th}$ and $\omega_{th}$ as the percentile and weight threshold, respectively, with $p_{th}, \omega_{th} \in [0, 1]$. The objective is to ensure that more than a fraction $p_{th}$ of the samples have weights larger than $\omega_{th}$. We first sort the samples according to their weights in increasing order, and denote the sorted weight sequence by $\vec{\omega}$. Let $\omega_k$ denote the weight of the $k$th sample, which lies at the $p_{th}$ percentile of $\vec{\omega}$. If $\omega_k \geq \omega_{th}$, then no samples are removed. Otherwise, we remove the first $\lceil \frac{M - k \omega_{th}}{1 - \omega_{th}} \rceil$ samples from $\vec{\omega}$. After the sample selection, let $S'$ denote the set of selected samples and let $A$ denote the set of APs hearable by the samples in $S'$.

In this paper, we adopt a polynomial function to fit a radio propagation surface for each AP in $A$ as follows:

$$\phi_n(x, y) = \sum_{i=1}^{p} \sum_{j=1}^{q} a_{ij} x^{i-1} y^{j-1}, \quad \text{for all } n \in A, \tag{3}$$

where the $a_{ij}$ are fitting coefficients. The objective of weighted surface fitting is to

$$\text{minimize } H \equiv \sum_{i=1}^{|S'|} \omega_i^{2} \left(\phi_n(x_i, y_i) - r_{in}\right)^{2}. \tag{4}$$

To compute one fitting coefficient $a_{er}$, we set the partial derivative of $H$ with respect to it to zero:

$$\begin{aligned} \frac{\partial H}{\partial a_{er}} &= \frac{\partial}{\partial a_{er}} \sum_{i=1}^{|S'|} \omega_i^{2} \left[\phi_n(x_i, y_i) - r_{in}\right]^{2} \\ &= \sum_{i=1}^{|S'|} \left\{ 2 \omega_i^{2} \left[\phi_n(x_i, y_i) - r_{in}\right] \frac{\partial}{\partial a_{er}} \left[\phi_n(x_i, y_i)\right] \right\} \\ &= \sum_{i=1}^{|S'|} \left\{ 2 \omega_i^{2} \left[\phi_n(x_i, y_i) - r_{in}\right] x_i^{e-1} y_i^{r-1} \right\} \\ &= 0. \end{aligned} \tag{5}$$

From the equation above, we can derive

$$\sum_{i=1}^{|S'|} 2 \omega_i^{2} x_i^{e-1} y_i^{r-1} \phi_n(x_i, y_i) = \sum_{i=1}^{|S'|} 2 \omega_i^{2} x_i^{e-1} y_i^{r-1} r_{in}, \tag{6}$$

and, substituting Equation (3),

$$\sum_{i=1}^{|S'|} 2 \omega_i^{2} x_i^{e-1} y_i^{r-1} \sum_{c=1}^{p} \sum_{d=1}^{q} a_{cd} x_i^{c-1} y_i^{d-1} = \sum_{i=1}^{|S'|} 2 \omega_i^{2} x_i^{e-1} y_i^{r-1} r_{in}. \tag{7}$$

We define

$$u_{cd}(e, r) = \sum_{i=1}^{|S'|} 2 \omega_i^{2} x_i^{c-1} y_i^{d-1} x_i^{e-1} y_i^{r-1}, \tag{8}$$

$$v(e, r) = \sum_{i=1}^{|S'|} 2 \omega_i^{2} x_i^{e-1} y_i^{r-1} r_{in}. \tag{9}$$

Thus, we can rewrite Equation (7) as:

$$\sum_{c=1}^{p} \sum_{d=1}^{q} a_{cd} u_{cd}(e, r) = v(e, r), \quad e = 1, \dots, p, \; r = 1, \dots, q. \tag{10}$$

The matrix form of the equation above is:

$$\begin{bmatrix} u_{11}(1, 1) & \dots & u_{pq}(1, 1) \\ \vdots & \ddots & \vdots \\ u_{11}(p, q) & \dots & u_{pq}(p, q) \end{bmatrix} \begin{bmatrix} a_{11} \\ \vdots \\ a_{pq} \end{bmatrix} = \begin{bmatrix} v(1, 1) \\ \vdots \\ v(p, q) \end{bmatrix}. \tag{11}$$

Then, the surface coefficients can be calculated by $A = U^{-1} V$, where $U$ is the matrix on the left-hand side of Equation (11), $V$ is the vector on its right-hand side, and $A$ here denotes the vector of fitting coefficients (not the AP set).
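A minimal sketch of this weighted fit, written in plain Python so that it stays self-contained, is given below. It accumulates the entries of Equations (8) and (9) and solves the linear system of Equation (11) by Gaussian elimination rather than by forming an explicit inverse; that choice, the function names, and the 0-based coefficient layout are assumptions of this sketch, not details from the paper.

```python
def fit_surface(samples, weights, p, q):
    """Fit the coefficients of Equation (3) for one AP.

    samples : list of (x, y, rss) tuples; weights : list of omega_i.
    Returns a p-by-q nested list where coeffs[c][d] multiplies x**c * y**d
    (0-based, i.e. coeffs[c][d] corresponds to a_{c+1, d+1} in the text).
    """
    terms = [(c, d) for c in range(p) for d in range(q)]
    size = len(terms)
    # Augmented matrix [U | V]: row index <-> (e, r), column index <-> (c, d).
    aug = [[0.0] * (size + 1) for _ in range(size)]
    for (x, y, rss), w in zip(samples, weights):
        basis = [x ** c * y ** d for c, d in terms]
        for row in range(size):
            scale = 2.0 * w * w * basis[row]
            for col in range(size):
                aug[row][col] += scale * basis[col]   # u_cd(e, r), Eq. (8)
            aug[row][size] += scale * rss             # v(e, r), Eq. (9)

    # Gauss-Jordan elimination with partial pivoting.
    for col in range(size):
        pivot = max(range(col, size), key=lambda k: abs(aug[k][col]))
        if abs(aug[pivot][col]) < 1e-12:
            raise ValueError("singular system: too few or degenerate samples")
        aug[col], aug[pivot] = aug[pivot], aug[col]
        for row in range(size):
            if row != col:
                factor = aug[row][col] / aug[col][col]
                for k in range(col, size + 1):
                    aug[row][k] -= factor * aug[col][k]
    flat = [aug[i][size] / aug[i][i] for i in range(size)]
    return [flat[c * q:(c + 1) * q] for c in range(p)]


def eval_surface(coeffs, x, y):
    """Evaluate the fitted surface phi_n(x, y)."""
    return sum(a * x ** c * y ** d
               for c, row in enumerate(coeffs) for d, a in enumerate(row))


# Synthetic samples lying exactly on the plane rss = -40 - 3x - 2y.
samples = [(x, y, -40.0 - 3.0 * x - 2.0 * y)
           for x in (0.0, 1.0, 2.0) for y in (0.0, 1.0, 2.0)]
weights = [1.0, 0.5, 0.8, 1.0, 0.3, 0.9, 0.7, 1.0, 0.6]
coeffs = fit_surface(samples, weights, 2, 2)
```

Because the synthetic samples lie exactly on a plane, the fit recovers that plane regardless of the weights; with noisy crowdsourced samples, low-weight samples would pull the surface less than high-weight ones.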

3.3. Weighting Fitted Surfaces

Each AP surface is constructed based on its weighted samples. Different AP surfaces could contribute differently for describing the whole signal space. We next assign two weights to each AP surface via an entropy-like quantity computed from its samples: one is used for subarea determination and the other for location search in our online positioning.

For each AP in $A$, let $R = \{r_1, \dots, r_R\}$ denote its set of RSS values extracted from the weighted samples in $S'$ (with a slight abuse of notation, $R$ also denotes the number of values in the set). As the samples are assumed to be crowdsourced randomly from different locations, the set $R$ is also expected to contain the RSS values from different locations. If all elements in $R$ have similar values, then this AP might not be very helpful for discriminating different locations in one subarea. On the other hand, such an AP may be seen as a good indication of this subarea for its RSS consistency. Motivated by such considerations, we propose to weight AP surfaces for their different subarea consistencies and location discriminations from an entropy-like viewpoint.

We first normalize the elements in $R$ by

$$\bar{r}_i = \frac{r_i - \min(R)}{\max(R) - \min(R)}, \quad \text{for all } r_i \in R. \tag{12}$$

We next compute an entropy-like quantity $\eta$ for each AP in $A$ to describe its RSS distribution property by

$$\eta = -\frac{\sum_{i=1}^{R} p_i \ln(p_i)}{\ln(R)}, \quad \text{where } p_i = \frac{\bar{r}_i}{\sum_{j=1}^{R} \bar{r}_j}. \tag{13}$$

For our two-step online positioning, we compute two surface weights for each AP, where $\eta_n$ denotes the entropy-like quantity of the $n$th AP:

$$\rho_n^{sub} = \frac{\eta_n}{\sum_{j=1}^{|A|} \eta_j}, \qquad \rho_n^{loc} = \frac{1 - \eta_n}{\sum_{j=1}^{|A|} (1 - \eta_j)}. \tag{14}$$

$\rho_n^{sub}$ is used in the subarea determination, while $\rho_n^{loc}$ is used in the location search in one subarea.
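Equations (12) to (14) can be sketched as below. Two conventions are assumptions of this sketch because the text does not spell them out: a term with $p_i = 0$ contributes zero to the sum (the usual $0 \ln 0 = 0$ convention), and an AP whose RSS values are all identical is given $\eta = 1$. The function names are illustrative.

```python
import math


def entropy_like(rss_values):
    """Entropy-like quantity eta of one AP, Equations (12) and (13)."""
    lo, hi = min(rss_values), max(rss_values)
    count = len(rss_values)
    if count < 2 or hi == lo:
        # Assumption: Equation (12) is undefined here; treat a constant RSS
        # set as maximally consistent.
        return 1.0
    normalized = [(r - lo) / (hi - lo) for r in rss_values]   # Eq. (12)
    total = sum(normalized)
    probs = [v / total for v in normalized]
    # Assumption: terms with p_i = 0 are skipped (0 * ln 0 taken as 0).
    entropy = -sum(p * math.log(p) for p in probs if p > 0.0)
    return entropy / math.log(count)                          # Eq. (13)


def surface_weights(etas):
    """Return (rho_sub, rho_loc) for a list of per-AP eta values, Eq. (14).

    Assumes the eta values are neither all 0 nor all 1, so that both
    normalizing sums are non-zero.
    """
    sub_total = sum(etas)
    loc_total = sum(1.0 - e for e in etas)
    rho_sub = [e / sub_total for e in etas]
    rho_loc = [(1.0 - e) / loc_total for e in etas]
    return rho_sub, rho_loc


# Two hypothetical APs: one with evenly spread RSS, one dominated by a
# single strong reading.
etas = [entropy_like([-70.0, -60.0, -50.0, -40.0]),
        entropy_like([-80.0, -80.0, -80.0, -40.0])]
rho_sub, rho_loc = surface_weights(etas)
```

Both weight vectors sum to one, and an AP with a larger $\eta$ receives a larger $\rho^{sub}$ and a smaller $\rho^{loc}$, matching the complementary roles the two weights play in the two positioning steps.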

3.4. Constructing Subarea Fingerprints

For each subarea, we construct a subarea fingerprint $\vec{f}$ based on its weighted surfaces $\phi_n$ ($n \in A$). We adopt a grid lattice approach to sample each surface $\phi_n$ uniformly in the physical space. Let $G$ denote such a grid structure. For the $g$th grid, let $f_{gn} = \phi_n(g_x, g_y)$ denote the grid RSS value sampled from the $n$th surface, where $(g_x, g_y)$ is the coordinate of the grid center. Then, $\vec{f}$ consists of the subarea-averaged RSS values of all hearable APs:

$$\vec{f} = \left( \frac{1}{|G|} \sum_{g \in G} f_{g1}, \dots, \frac{1}{|G|} \sum_{g \in G} f_{gN'} \right), \tag{15}$$

where $N' = |A|$ is the number of hearable APs in $A$.
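The averaging in Equation (15) can be sketched as follows, assuming each fitted surface is available as a Python callable and the subarea is a rectangle tiled by square cells. The helper `make_grid`, the AP identifiers, and the two toy surfaces are hypothetical.

```python
def make_grid(width, height, cell):
    """Centers of the square cells of side `cell` tiling a rectangle."""
    nx, ny = int(width // cell), int(height // cell)
    return [((i + 0.5) * cell, (j + 0.5) * cell)
            for i in range(nx) for j in range(ny)]


def subarea_fingerprint(surfaces, grid_centers):
    """Average every AP surface over the grid centers, Equation (15).

    surfaces     : dict mapping AP id -> callable phi(x, y) returning RSS
    grid_centers : list of (g_x, g_y) coordinates covering the subarea
    """
    if not grid_centers:
        raise ValueError("at least one grid center is required")
    return {
        ap: sum(phi(gx, gy) for gx, gy in grid_centers) / len(grid_centers)
        for ap, phi in surfaces.items()
    }


# Toy surfaces for a 4 m x 2 m room sampled on a 1 m grid.
surfaces = {
    "ap1": lambda x, y: -40.0 - 2.0 * x,
    "ap2": lambda x, y: -55.0 - 1.0 * y,
}
grid = make_grid(4.0, 2.0, 1.0)
fingerprint = subarea_fingerprint(surfaces, grid)
```

The result holds one averaged RSS value per AP, which is the form the subarea fingerprint takes when it is later compared with a test sample.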

4. The Online Positioning Algorithm

The online positioning consists of two phases: subarea determination and location search.

Subarea Determination: Let $\vec{f}_t$ denote the RSS vector of a test sample, and $\vec{f}_s$ the $s$th subarea fingerprint. Let $A_{int}$ denote the set of APs hearable in both $\vec{f}_t$ and $\vec{f}_s$. We compute the weighted signal distance between $\vec{f}_t$ and $\vec{f}_s$ as:

$$D_s = \frac{1}{|A_{int}|} \sqrt{\sum_{n \in A_{int}} \left(\rho_n^{sub} \times (f_{sn} - f_{tn})\right)^{2}}, \tag{16}$$

where $f_{sn}$ and $f_{tn}$ are the RSS values from the $n$th hearable AP in $\vec{f}_s$ and $\vec{f}_t$, respectively. The test sample is then localized into the subarea with the minimum $D_s$.
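The subarea determination step can be sketched as below, assuming fingerprints and weights are stored as dictionaries keyed by AP identifier. Returning an infinite distance when a subarea shares no AP with the test sample is an assumption of this sketch, as are the function names and the toy values.

```python
import math


def weighted_distance(test_fp, subarea_fp, rho_sub):
    """Weighted signal distance D_s of Equation (16)."""
    common = set(test_fp) & set(subarea_fp)          # A_int
    if not common:
        # Assumption: a subarea sharing no AP with the test sample is
        # never selected.
        return math.inf
    squared = sum((rho_sub[ap] * (subarea_fp[ap] - test_fp[ap])) ** 2
                  for ap in common)
    return math.sqrt(squared) / len(common)


def determine_subarea(test_fp, subareas):
    """Pick the subarea with the minimum D_s.

    subareas maps a subarea name to a (fingerprint, rho_sub) pair of dicts.
    """
    return min(subareas,
               key=lambda name: weighted_distance(test_fp, *subareas[name]))


# Two hypothetical subareas; "ap3" is heard only by the test sample.
subareas = {
    "room_a": ({"ap1": -45.0, "ap2": -70.0}, {"ap1": 0.5, "ap2": 0.5}),
    "corridor": ({"ap1": -75.0, "ap2": -50.0}, {"ap1": 0.5, "ap2": 0.5}),
}
test_fp = {"ap1": -48.0, "ap2": -68.0, "ap3": -80.0}
best = determine_subarea(test_fp, subareas)
```

Only the APs present in both fingerprints enter the sum, so an AP heard solely by the test sample does not affect the decision.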

Location Search: Assume that the $s$th subarea is selected in the first phase. We next search for a point $(\hat{x}, \hat{y})$ in this subarea that minimizes the weighted signal difference between $\vec{f}_t$ and the subarea surfaces:

$$(\hat{x}, \hat{y}) = \arg\min_{(x, y)} \sum_{n \in A_{int}} \left[\rho_n^{loc} \left(\phi_n(x, y) - f_{tn}\right)\right]^{2}. \tag{17}$$

In this paper, we use the gradient descent search method. Instead of randomly choosing a start point, we use the localization result of a simple nearest neighbor (NN) algorithm as the initial search point, where the grid fingerprints are spatially sampled from the fitted surfaces. We then use the weighted signal difference as the cost function and its partial derivatives to determine the search direction. The cost function is defined as

$$J(l_t) = \sum_{n \in A_{int}} \left[\rho_n^{loc} \left(\phi_n(x, y) - f_{tn}\right)\right]^{2}, \tag{18}$$

where $l_t = (x, y)$ denotes the current search position.

The search iteration is defined by

$$l_{t+1} = l_t + \alpha_d d_t, \quad \text{where } d_t = -\nabla J(l_t), \tag{19}$$

$$\nabla J(l_t) = \left[\frac{\partial J(l_t)}{\partial x}, \frac{\partial J(l_t)}{\partial y}\right]^{T}, \tag{20}$$

where $\alpha_d$ is the search step. We substitute Equation (3) into Equation (18):

$$J(l_t) = \sum_{n \in A_{int}} \left[\rho_n^{loc} \left(\sum_{i=1}^{p} \sum_{j=1}^{q} a_{ij} x^{i-1} y^{j-1} - f_{tn}\right)\right]^{2}. \tag{21}$$

Next, we compute the partial derivatives of this cost function to obtain the gradient and update the search iteration:

$$\frac{\partial J(l_t)}{\partial x} = \sum_{n \in A_{int}} 2 (\rho_n^{loc})^{2} \left[\phi_n(x, y) - f_{tn}\right] \sum_{i=1}^{p} \sum_{j=1}^{q} a_{ij} (i-1) x^{i-2} y^{j-1}, \tag{22}$$

$$\frac{\partial J(l_t)}{\partial y} = \sum_{n \in A_{int}} 2 (\rho_n^{loc})^{2} \left[\phi_n(x, y) - f_{tn}\right] \sum_{i=1}^{p} \sum_{j=1}^{q} a_{ij} x^{i-1} (j-1) y^{j-2}. \tag{23}$$

The gradient search stops when $d_t$ is too small to update the search position for the next iteration.
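A minimal sketch of this gradient search is given below. It assumes each surface is stored as a nested coefficient list in which `coeffs[i][j]` multiplies `x**i * y**j` (0-based, so it corresponds to $a_{i+1,j+1}$ in the text); the fixed step size, the tolerance, the iteration cap, and all names are illustrative choices, and the start point is passed in directly instead of being computed by the NN algorithm.

```python
def surface_value(coeffs, x, y):
    """phi_n(x, y) for a coefficient table with 0-based exponents."""
    return sum(a * x ** i * y ** j
               for i, row in enumerate(coeffs) for j, a in enumerate(row))


def surface_gradient(coeffs, x, y):
    """Partial derivatives of phi_n with respect to x and y."""
    dx = sum(a * i * x ** (i - 1) * y ** j
             for i, row in enumerate(coeffs)
             for j, a in enumerate(row) if i > 0)
    dy = sum(a * j * x ** i * y ** (j - 1)
             for i, row in enumerate(coeffs)
             for j, a in enumerate(row) if j > 0)
    return dx, dy


def gradient_search(surfaces, rho_loc, test_fp, start,
                    step=0.05, tol=1e-6, max_iter=10000):
    """Minimize the cost J of Equation (18) by fixed-step gradient descent."""
    x, y = start
    for _ in range(max_iter):
        gx = gy = 0.0
        for ap, coeffs in surfaces.items():
            if ap not in test_fp:          # only APs in A_int contribute
                continue
            residual = surface_value(coeffs, x, y) - test_fp[ap]
            dx, dy = surface_gradient(coeffs, x, y)
            gx += 2.0 * rho_loc[ap] ** 2 * residual * dx   # Eq. (22)
            gy += 2.0 * rho_loc[ap] ** 2 * residual * dy   # Eq. (23)
        # Stop once the update would move the position by less than tol.
        if step * (gx * gx + gy * gy) ** 0.5 < tol:
            break
        x, y = x - step * gx, y - step * gy                # Eq. (19)
    return x, y


# Two toy planar surfaces: phi_1 = -40 - 2x and phi_2 = -50 - 3y.
surfaces = {"ap1": [[-40.0, 0.0], [-2.0, 0.0]],
            "ap2": [[-50.0, -3.0], [0.0, 0.0]]}
rho_loc = {"ap1": 0.5, "ap2": 0.5}
test_fp = {"ap1": -46.0, "ap2": -56.0}     # consistent with (x, y) = (3, 2)
x_hat, y_hat = gradient_search(surfaces, rho_loc, test_fp, start=(1.0, 1.0))
```

For higher-order surfaces the cost is no longer convex, which is one reason a good initial point, such as the NN estimate used in the text, matters.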

5. Field Measurements and Experiments

5.1. Experiment Settings

Figure 3 plots the indoor layout of our field measurements in a typical lecture building with a total area of 482 m². In our work, we did not place our own APs. Instead, we employed the existing Wi-Fi infrastructure with APs deployed by different parties, such as individual laboratories, telecom operators, and campus authorities. Indeed, the total number of hearable APs in our experimental environment was more than 400, while more than 70 APs could normally be heard in each sample. We note that employing the existing Wi-Fi infrastructure makes our proposed scheme ready to be implemented in many practical scenarios.

A Huawei Honor 3C smartphone was used to collect RSS samples. We conducted two batches of sample collection. The first batch, $S_{site}$, was based on the site survey approach and contains 13,670 samples in total, each collected at one grid center. The second batch, $S_{walk}$, contains 13,370 samples in total, extracted from movement trajectories restricted to walkable routes, as illustrated by the colored area in one room in Figure 3. Note that the samples in both $S_{site}$ and $S_{walk}$ were first annotated with their true locations at collection. To emulate annotation errors, we then re-annotated each sample with a new location, obtained by adding a location offset randomly drawn from a Gaussian distribution with zero mean and standard deviation $\sigma$. The test set $S_{test}$ contains 5600 samples uniformly distributed in the whole environment.
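The annotation error emulation can be sketched as follows. The text only states a zero-mean Gaussian offset with standard deviation $\sigma$; drawing the offset independently for each coordinate axis is an assumption of this sketch, and the function name and sample layout are hypothetical.

```python
import random


def perturb_locations(samples, sigma, seed=None):
    """Return copies of the samples with noisy annotated locations.

    samples : list of (x, y, rss_vector) tuples with true locations
    sigma   : standard deviation of the location offset, in meters
    seed    : optional seed for reproducible experiments

    Assumption: the offset is drawn independently per axis from
    N(0, sigma^2).
    """
    rng = random.Random(seed)
    noisy = []
    for x, y, rss in samples:
        noisy.append((x + rng.gauss(0.0, sigma),
                      y + rng.gauss(0.0, sigma),
                      rss))
    return noisy


# Hypothetical samples with two APs each.
true_samples = [(1.0, 2.0, [-50.0, -70.0]), (3.5, 0.5, [-62.0, -58.0])]
noisy_samples = perturb_locations(true_samples, sigma=0.9, seed=1)
```

The RSS vectors are left untouched, so only the location labels become unreliable, which mirrors the kind of error the sample weighting is designed to handle.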

Experiment Schemes: According to their annotated locations, crowdsourced samples can be assigned into different grids to construct grid fingerprints. Similarly, they can also be grouped into different clusters in the signal space to obtain cluster fingerprints. We tested the following peer localization schemes to examine these typical approaches.

FGrid emulates the traditional site-survey fingerprinting based on grid fingerprints. It divides the subarea into several non-overlapping grid cells to contain samples and assigns each new sample to its nearest grid cell. For each grid cell, a grid fingerprint is composed by averaging all samples located within the grid cell, and the location of the grid fingerprint is annotated as the grid center. In the online phase, we used the nearest neighbor algorithm.
SGrid obtains grid fingerprints in the same way as FGrid. We then constructed surfaces based on these fingerprints in the offline phase. In the online phase, we used the same surface search method as in our proposed SWSample.
SRaw retains the original position of every crowdsourced sample and fits propagation surfaces based on them. In the online phase, we used the same surface search method as in our proposed SWSample.
SCluster clusters the samples in the signal domain only. For each cluster, we obtained a cluster fingerprint, which is the average of its cluster members’ RSS vectors. The location of a cluster fingerprint is the geometric center of the cluster members. We fitted the propagation surfaces for every AP based on these cluster fingerprints. In the online phase, we used the same surface search method as in our proposed SWSample.
SWSample is the proposed scheme.

In all the above schemes, we set the cluster number equal to the number of grids used in FGrid for a fair comparison. We also adopted the proposed two-step online positioning algorithm. In our experiments, the subarea hitting rate of all these schemes is not smaller than 99.58%, i.e., almost all test samples are assigned to the correct subarea. Thus, we do not report this result again in the following.

5.2. Surface Fitting Examples

Figure 4, Figure 5, Figure 6 and Figure 7 plot the fitted surfaces for the four surfacing-based schemes. We chose one AP for Room A and fitted its surface from 1800 samples randomly drawn from $S_{site} \cup S_{walk}$. For the SWSample fitted surface in Figure 7, we also color the samples by weight, as shown by the weight color spectrum alongside the graph. It can be seen that the proposed scheme produces a smoother surface than the other schemes. If we assume that this AP is located around the coordinate with the highest RSS value, then the surface in Figure 7 looks more like an attenuated sphere centered at the AP. The Keenan–Motley path loss model has been widely adopted to characterize the radio propagation in mobile cellular networks. If such a model is still applicable in a small and open space such as a room, then our fitted surface is the one that most closely resembles this model, which might also help to explain the effectiveness of our weighted surface fitting.

5.3. Experiment Results

Uniformly distributed samples: We first considered the scenario in which all crowdsourced samples are uniformly distributed in the experiment environment, that is, we used crowdsourced samples from $S_{site} \cup S_{walk}$. Figure 8 plots the average localization error (ALE) against the number of crowdsourced samples randomly drawn from $S_{site} \cup S_{walk}$. We first observe that all the surfacing schemes outperform the grid fingerprinting scheme FGrid, which validates the effectiveness of using fitted radio propagation surfaces for localization. When the number of samples increases from about $0.33 M_g$ to $20 M_g$, with $M_g$ the total number of grid cells, the ALE of the surfacing schemes first decreases and then increases. At first, the number of samples is not large enough to fit the actual surfaces well. In this case, our scheme SWSample has a slightly higher ALE than the other surfacing schemes (see the first two points in Figure 8) due to its sample selection. On the other hand, if there are too many noisy samples, the surfaces may be overfitted to unreliable samples. However, our scheme degrades only mildly: its ALE when using all 27,040 samples is 1.54 m, slightly higher than the best case of 1.45 m when using 3605 samples. The positioning accuracy improvements of our scheme are 36.71% over FGrid and 9.41% over SRaw, respectively. Compared with the SRaw scheme, the improvement can be attributed to our sample weighting and selection algorithm, which chooses only the reliable samples for weighted surface fitting, leading to a more accurate radio map and better positioning results.

Figure 9 presents the ALE against the standard deviation $\sigma$ of the location offset. Notice that $\sigma = 0$ indicates no annotation errors. As expected, all schemes suffer as $\sigma$ increases, i.e., as the annotated locations move farther away from the true locations. However, our scheme SWSample still performs the best. Figure 10 plots the cumulative distribution function (CDF) of the localization error. It is worth noting that, besides a low median localization error of 1.51 m, our SWSample has a low 90th percentile error of only 2.64 m. To provide the exact numbers, Table 2 summarizes the localization error results for three situations, namely, $\sigma = 0.6$ m, $\sigma = 0.9$ m, and $\sigma = 1.2$ m.
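The error statistics reported here (average, median, and 90th percentile of the localization error) can be computed along the following lines. The percentile is read off the empirical CDF as the smallest error that at least the given fraction of test samples does not exceed; this definition, like the function names and toy coordinates, is an assumption of the sketch, since the text does not state which percentile convention was used.

```python
import math


def localization_errors(estimates, truths):
    """Euclidean distance between each estimated and true location."""
    return [math.hypot(ex - tx, ey - ty)
            for (ex, ey), (tx, ty) in zip(estimates, truths)]


def average_error(errors):
    """Average localization error (ALE)."""
    return sum(errors) / len(errors)


def empirical_percentile(errors, fraction):
    """Smallest error e such that at least `fraction` of errors are <= e."""
    ordered = sorted(errors)
    rank = max(1, math.ceil(fraction * len(ordered)))
    return ordered[rank - 1]


# Hypothetical estimates and ground truth for four test samples.
estimates = [(0.0, 0.0), (3.0, 4.0), (1.0, 1.0), (0.0, 2.0)]
truths = [(0.0, 0.0), (0.0, 0.0), (1.0, 2.0), (0.0, 0.0)]
errors = localization_errors(estimates, truths)
ale = average_error(errors)
median_error = empirical_percentile(errors, 0.5)
p90_error = empirical_percentile(errors, 0.9)
```

Reporting the 90th percentile alongside the average is useful because a few large outliers can dominate the ALE while leaving the median unchanged.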

Non-uniformly distributed samples: It is also often the case that crowdsourced samples are not uniformly distributed in the whole environment. To examine this nonuniform density issue, we used only the samples from $S_{walk}$ to fit surfaces. That is, the subregions occupied by chairs and desks in each room do not contain crowdsourced samples. However, as we intentionally include location annotation errors, some samples may still be annotated to locations within such a vacant subregion. As shown in Figure 11 and Figure 12, all schemes suffer from such a nonuniform density situation, as expected, compared with the results in Figure 8. However, our SWSample scheme still outperforms the other schemes in most cases. The positioning accuracy improvements are 36.85% over FGrid and 18.79% over SRaw, respectively. Furthermore, the median and 90th percentile localization errors in Figure 13 are 2.04 m and 3.24 m, respectively, which are comparable to the uniform density case. Table 2 summarizes the localization error results from three situations for non-uniformly distributed samples. It can be observed that our proposed scheme has great potential to obtain a better result in this non-uniformly distributed case, which illustrates its robustness in tackling the nonuniform density challenge.

6. Concluding Remarks

This paper has studied the problem of constructing radio propagation surfaces from unreliable crowdsourced samples with annotation errors. We have proposed a cross-domain cluster intersection algorithm to weight the reliability of each sample and an entropy-like approach to further weight the constructed surfaces. Field experiments have validated the effectiveness and robustness of the proposed scheme in dealing with the annotation error and nonuniform density challenges. Our proposed method contributes to the indoor localization community through its high accuracy and easy implementation.

We close this paper with some discussions about future work. This paper has applied polynomial functions for fitting radio propagation surfaces in the offline phase. Indeed, the propagation surfaces may take different forms and there could exist many other primary functions or stochastic kernels for surface fitting. How to intelligently choose the most suitable primary functions or stochastic kernels and automatically adjust their fitting parameters are worthy of further research. In this paper, we have applied the commonly used deterministic positioning algorithm in the online phase. Using some probabilistic positioning algorithms, especially when the radio propagation surfaces are modelled as stochastic processes, is also worthy of further investigation.

Author Contributions

B.W. conceptualized the problem and revised the paper; B.W. and J.L. investigated the problem solutions and proposed the algorithm; J.L. implemented the algorithm and drafted the paper; J.L. and G.Y. conducted the experiments and analyzed the results; M.Z. reviewed and revised the paper.

Funding

This work is partly supported by the National Natural Science Foundation of China with grant number 61771209 and the Fundamental Research Funds for the Central Universities with grant number 2018KFYYXJJ136.

Conflicts of Interest

The authors declare no conflict of interest.

References

Liu, H.; Darabi, H.; Banerjee, P.; Liu, J. Survey of wireless indoor positioning techniques and systems. IEEE Trans. Syst. Man Cybern. C 2007, 37, 1067–1080.
Yassin, A.; Nasser, Y.; Awad, M.; Al-Dubai, A.; Liu, R.; Yuen, C.; Raulefs, R.; Aboutanios, E. Recent advances in indoor localization: A survey on theoretical approaches and applications. IEEE Commun. Surv. Tutor. 2016, 19, 1327–1346.
He, S.; Chan, S.-H.G. Wi-Fi fingerprint-based indoor positioning: Recent advances and comparisons. IEEE Commun. Surv. Tutor. 2016, 18, 466–490.
Wang, B.; Zhou, S.; Liu, W.; Mo, Y. Indoor localization based on curve fitting and location search using received signal strength. IEEE Trans. Ind. Electron. 2015, 62, 572–582.
Bahl, P.; Padmanabhan, V.N. RADAR: An in-building RF-based user location and tracking system. In Proceedings of the IEEE INFOCOM 2000. Conference on Computer Communications. Nineteenth Annual Joint Conference of the IEEE Computer and Communications Societies (Cat. No.00CH37064), Tel Aviv, Israel, 26–30 March 2000; Volume 2, pp. 775–784.
Hossain, A.; Soh, W.-S. A survey of calibration-free indoor positioning systems. Comput. Commun. 2015, 66, 1–13.
Wang, B.; Chen, Q.; Yang, L.T.; Chao, H.-C. Indoor smartphone localization via fingerprint crowdsourcing: Challenges and approaches. IEEE Wirel. Commun. 2016, 23, 82–89.
Zhou, X.; Chen, T.; Guo, D.; Teng, X.; Yuan, B. From one to crowd: A survey on crowdsourcing-based wireless indoor localization. Front. Comput. Sci. 2018, 12, 423–450.
He, S.; Ji, B.; Chan, S.-H.G. Chameleon: Survey-free updating of a fingerprint database for indoor localization. IEEE Pervasive Comput. 2016, 15, 66–75.
Abdelnasser, H.; Mohamed, R.; Elgohary, A.; Alzantot, M.F.; Wang, H.; Sen, S.; Choudhury, R.R.; Youssef, M. SemanticSLAM: Using environment landmarks for unsupervised indoor localization. IEEE Trans. Mob. Comput. 2016, 15, 1770–1782.
Zhou, M.; Zhang, Q.; Wang, Y.; Tian, Z. Hotspot ranking based indoor mapping and mobility analysis using crowdsourced Wi-Fi signal. IEEE Access 2017, 5, 3594–3602.
Wu, C.; Yang, Z.; Xiao, C. Automatic radio map adaptation for indoor localization using smartphones. IEEE Trans. Mob. Comput. 2018, 17, 517–528.
Chen, Q.; Wang, B. FinCCM: Fingerprint crowdsourcing, clustering and matching for indoor subarea localization. IEEE Wirel. Commun. Lett. 2015, 4, 677–680.
Liu, X.; Zhan, Y.; Cen, J. An energy-efficient crowd-sourcing-based indoor automatic localization system. IEEE Sens. J. 2018, 18, 6009–6022.
Chang, Q.; Li, Q.; Shi, Z.; Chen, W.; Wang, W. Scalable indoor localization via mobile crowdsourcing and Gaussian process. Sensors 2016, 16, 381.
Jung, S.; Moon, B.; Han, D. Unsupervised learning for crowdsourced indoor localization in wireless networks. IEEE Trans. Mob. Comput. 2016, 15, 2892–2906.
Zhou, M.; Tang, Y.; Tian, Z.; Geng, X. Semi-supervised learning for indoor hybrid fingerprint database calibration with low effort. IEEE Access 2017, 5, 4388–4400.
Jung, S.; Han, H. Automated construction and maintenance of Wi-Fi radio maps for crowdsourcing-based indoor positioning systems. IEEE Access 2017, 6, 1764–1777.
Kim, Y.; Shin, H.; Chon, Y.; Cha, H. Crowdsensing-based Wi-Fi radio map management using a lightweight site survey. Comput. Commun. 2015, 60, 86–96.
Huang, Z.; Xia, J.; Yu, H.; Guan, Y.; Gan, X.; Liu, J. Fusing fixed and hint landmarks on crowd paths for automatically constructing Wi-Fi fingerprint database. China Commun. 2015, 12, 11–24.
Zhou, B.; Li, Q.; Mao, Q.; Tu, W.; Zhang, X.; Chen, L. ALIMC: Activity landmark-based indoor mapping via crowdsourcing. IEEE Trans. Intell. Transp. Syst. 2015, 16, 2774–2785.
Yu, N.; Xiao, C.; Wu, Y.; Feng, R. A radio-map automatic construction algorithm based on crowdsourcing. Sensors 2016, 16, 504.
Zhou, B.; Li, Q.; Mao, Q.; Tu, W. A robust crowdsourcing-based indoor localization system. Sensors 2017, 17, 864.
Li, W.; Wei, D.; Lai, Q.; Li, X.; Yuan, H. Geomagnetism-aided indoor Wi-Fi radio-map construction via smartphone crowdsourcing. Sensors 2018, 18, 1462.
Zhou, M.; Wang, Y.; Tian, Z.; Zhang, Q. Indoor pedestrian motion detection via spatial clustering and mapping. IEEE Sens. Lett. 2018, 2, 1–4.
Wang, B.; Zhou, S.; Yang, L.T.; Mo, Y. Indoor positioning via subarea fingerprinting and surface fitting with received signal strength. Pervasive Mob. Comput. 2015, 23, 43–58.
Ye, Y.; Wang, B. RMapCS: Radio map construction from crowdsourced samples for indoor localization. IEEE Access 2018, 6, 24224–24238.

Figure 1. The flowchart of the proposed system: In the offline phase, each crowdsourced sample is weighted according to our algorithm. For each access point and each subarea, a radio propagation surface is first fitted from the selected and weighted samples, and the fitted surface is then weighted as well. Subarea fingerprints are then composed from the fitted surfaces. In the online phase, a test sample is first compared with the subarea fingerprints to determine the subarea it belongs to, and a gradient search is then used to estimate its exact location.

Figure 2. Illustration of the cross-domain cluster intersection algorithm: In the physical space, samples are clustered according to their annotated coordinates. In the signal space, samples are clustered according to their RSS distances. The weight of a sample is determined by the samples shared by the physical cluster and the signal cluster to which it belongs.

Figure 3. The layout of the indoor environment. A grid lattice has been used to collect samples, with 1368 grid cells in total, each of size $0.6 \times 0.6$ m$^{2}$. In addition, pedestrian trajectories have been used to collect samples in the corridor and along the walkable pathways in each room.

Figure 4. Illustration of the surface fitted by SGrid. We choose one AP for Room A and fit its surface from 1800 samples randomly drawn from $S_{site} \cup S_{walk}$. Crowdsourced samples are assigned to grid cells. A grid fingerprint is composed by averaging all samples in the grid cell, and its location is the grid center. The fitted surface is based on the grid fingerprints.

Figure 5. Illustration of the surface fitted by SRaw. We choose one AP for Room A and fit its surface from 1800 samples randomly drawn from $S_{site} \cup S_{walk}$. All crowdsourced samples are used for surface fitting, without sample weighting and selection.

Figure 6. Illustration of the surface fitted by SCluster. We choose one AP for Room A and fit its surface from 1800 samples randomly drawn from $S_{site} \cup S_{walk}$. All crowdsourced samples are first clustered in the signal space. For each cluster, a cluster fingerprint is composed by averaging the RSS vectors of its cluster members, and its location is the geometric center of the cluster members. The fitted surface is based on the cluster fingerprints.

Figure 7. Illustration of the surface fitted by our proposed SWSample. We choose one AP for Room A and fit its surface from 1800 samples randomly drawn from $S_{site} \cup S_{walk}$. Crowdsourced samples are weighted and selected for surface construction. The sample weight is illustrated by the dot color in the figure.

Figure 8. Comparison of localization performance. The average localization error (ALE) vs. the number of crowdsourced samples $M_{all}$, when using crowdsourced samples from $S_{site} \cup S_{walk}$. The standard deviation of the location offset is $σ = 1.2$ m.

Figure 9. Comparison of localization performance. The average localization error (ALE) vs. the standard deviation $σ$ of the location offset, where $M_{all}$ = 27,040.

Figure 10. Comparison of the cumulative distribution function (CDF) of the localization error, where $M_{all}$ = 27,040 and $σ = 1.2$ m.

Figure 11. Comparison of localization performance, when using crowdsourced samples only from $S_{walk}$. The average localization error (ALE) vs. the number of crowdsourced samples $M_{all}$, where $σ = 1.2$ m.

Figure 12. Comparison of localization performance, when using crowdsourced samples only from $S_{walk}$. The average localization error (ALE) vs. the standard deviation $σ$ of the location offset, where $M_{all} = 4456$.

Figure 13. Comparison of the cumulative distribution function (CDF) of the localization error with $M_{all} = 4456$ and $σ = 1.2$ m, when using crowdsourced samples only from $S_{walk}$.

Table 1. Table of symbols.

Symbol	Definition
$S$	A set of crowdsourced samples in one subarea.
$M$	The number of crowdsourced samples in $S$, $M = |S|$.
$s_{i}$	The $i$th crowdsourced sample in $S$.
${\vec{l}}_{i}$	The annotated location of the $i$th crowdsourced sample.
${\vec{r}}_{i}$	The RSS vector of the $i$th crowdsourced sample.
$N$	The maximum number of hearable APs in $S$.
$K$	The number of clusters.
$C^{p}$	The set of clusters in the physical space.
$C^{s}$	The set of clusters in the signal space.
$γ_{i}$	The cross-domain cluster coefficient of the $i$th sample.
$ω_{i}$	The reliability weight of the $i$th sample.
$ϕ(x, y)$	The RSS surface function.
$p_{th}$	The percentile threshold in the sample selection method.
$ω_{th}$	The weight threshold in the sample selection method.
$\vec{ω}$	The sample reliability weights sorted in increasing order.
$ω_{k}$	The reliability weight at the $p_{th}$ percentile in $\vec{ω}$.
$S'$	The set of selected samples.
$A$	The set of APs hearable by samples in $S'$.
$α_{ij}$	The surface coefficient of the RSS surface function.
$R$	The set of RSS values from an AP in $S'$.
${\bar{r}}_{i}$	The $i$th normalized element of $R$.
$η$	The entropy-like quantity for each AP in $A$.
$ρ_{n}^{sub}$	The surface weight of the $n$th AP in $A$ for subarea determination.
$ρ_{n}^{loc}$	The surface weight of the $n$th AP in $A$ for location search.
$\vec{f}$	Subarea fingerprint.
$\mathcal{G}$	The set of grid cells in one subarea.
$G$	The number of grid cells in $\mathcal{G}$, $G = |\mathcal{G}|$.
${\vec{f}}_{t}$	The RSS vector of a test sample.
${\vec{f}}_{s}$	The $s$th subarea fingerprint.
$A_{int}$	The set of APs hearable by both ${\vec{f}}_{t}$ and ${\vec{f}}_{s}$.
$D_{s}$	The weighted signal distance between the test sample and a subarea.
$M_{g}$	The number of grid cells.
$σ$	The standard deviation of location offset.
$S_{site}$	The set of samples from site survey.
$S_{walk}$	The set of samples from pedestrian trajectories.

Table 2. Comparison of the mean, 50% and 90% localization errors (in m) for uniformly (Uni.) and non-uniformly (Non-uni.) distributed samples.

Distribution	Scheme	$σ = 0$ m			$σ = 0.6$ m			$σ = 1.2$ m		
		Mean	50%	90%	Mean	50%	90%	Mean	50%	90%
Uni.	FGrid	2.479	2.448	3.672	2.284	2.086	3.744	2.421	2.217	3.868
Uni.	SGrid	1.571	1.353	2.595	1.726	1.630	2.884	1.898	1.757	3.048
Uni.	SRaw	1.575	1.370	2.645	1.618	1.524	2.694	1.711	1.688	2.873
Uni.	SCluster	1.552	1.364	2.550	1.708	1.657	2.875	1.916	1.879	3.111
Uni.	SWSample	1.373	1.124	2.413	1.374	1.243	2.470	1.513	1.366	2.640
Non-uni.	FGrid	2.897	2.776	3.672	2.982	2.813	4.477	3.059	2.932	4.502
Non-uni.	SGrid	2.164	1.691	3.522	2.086	1.679	3.402	2.169	1.795	3.499
Non-uni.	SRaw	2.155	1.713	3.459	2.221	1.732	3.594	2.322	1.898	3.647
Non-uni.	SCluster	2.063	1.602	3.497	2.009	1.584	3.287	2.144	1.752	3.477
Non-uni.	SWSample	1.854	1.497	3.172	1.951	1.472	3.217	2.043	1.625	3.242

© 2018 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (http://creativecommons.org/licenses/by/4.0/).
