MapSentinel: Can the Knowledge of Space Use Improve Indoor Tracking Further?

Estimating an occupant’s location is arguably the most fundamental sensing task in smart buildings. The applications for fine-grained, responsive building operations require the location sensing systems to provide location estimates in real time, also known as indoor tracking. Existing indoor tracking systems require occupants to carry specialized devices or install programs on their smartphone to collect inertial sensing data. In this paper, we propose MapSentinel, which performs non-intrusive location sensing based on WiFi access points and ultrasonic sensors. MapSentinel combines the noisy sensor readings with the floormap information to estimate locations. One key observation supporting our work is that occupants exhibit distinctive motion characteristics at different locations on the floormap, e.g., constrained motion along the corridor or in the cubicle zones, and free movement in the open space. While extensive research has been performed on using a floormap as a tool to obtain correct walking trajectories without wall-crossings, there have been few attempts to incorporate the knowledge of space use available from the floormap into the location estimation. This paper argues that the knowledge of space use as an additional information source presents new opportunities for indoor tracking. The fusion of heterogeneous information is theoretically formulated within the Factor Graph framework, and the Context-Augmented Particle Filtering algorithm is developed to efficiently solve real-time walking trajectories. Our evaluation in a large office space shows that the MapSentinel can achieve accuracy improvement of 31.3% compared with the purely WiFi-based tracking system.


Introduction
Indoor location sensing has become an integral part of "smart buildings", as it offers great potential for improving building operation and saving energy. For instance, an on-demand ventilation or lighting control policy must know how the building spaces are used: when occupants enter or exit the building, which spaces they occupy, at what time, and for how long. Such applications require the location sensing system to provide real-time estimates of occupants' locations, a task also termed "indoor tracking", in order to realize fine-grained, responsive building operations.

• We build a non-intrusive location sensing network consisting of modified WiFi access points and ultrasonic calibration stations, which does not require the occupants to install any specialized programs on their smartphones and thus avoids the energy and occupant engagement issues.
• We propose an information fusion framework for indoor tracking, which theoretically formalizes the fusion of the floormap information and the noisy sensor data using a Factor Graph. The Context-Augmented Particle Filtering algorithm is developed to efficiently solve the walking trajectories in real time. The fusion framework can flexibly graft floormap information onto other types of tracking systems, not limited to the WiFi tracking schemes that we demonstrate in this paper.
• We evaluate our system in a large typical office environment, and our tracking system achieves a significant tracking accuracy improvement over purely WiFi-based tracking systems.
The rest of this paper expands on each of these contributions. We conclude the paper and discuss future work in Section 6.

Figure 1 presents the overall architecture of MapSentinel. There are three key components in MapSentinel: the non-intrusive sensing networks, the floormap processing engine, and the information fusion algorithm. The non-intrusive sensing networks, as the name suggests, generate location-related measurements without the need for computation on the smartphone end. Our sensing networks consist of WiFi access points (APs) and ultrasonic calibration stations, which track locations by relating the WiFi signal strength or the sound time-of-flight to the distance. The floormap processing engine converts the pictorial floormap to information that can be directly combined with the sensor measurements in the fusion algorithm. The output of the floormap processing engine represents the prior knowledge obtained from the map, and can be computed in the offline phase. We will present the details of the main components of MapSentinel in this section.

WiFi Access Points
IEEE 802.11 (WiFi) is the most commonly used wireless networking technology with widely available infrastructure in large numbers of commercial and residential buildings. Nearly every existing commercial mobile device is WiFi enabled. The common method to utilize WiFi for indoor location sensing is to enable the mobile device to collect WiFi Received Signal Strengths (RSS) of nearby WiFi APs by installing an application on the mobile devices. Our system, on the contrary, leverages WiFi in a non-intrusive manner. Rather than modifying the hardware or software of occupants' mobile devices, we upgrade the software of the existing commercial WiFi APs to allow them to detect the RSS of each mobile device, while providing basic internet service to occupants as well. The RSS and media access control (MAC) address of each mobile device will be forwarded to the server and the occupant can be identified through the unique MAC address of the mobile device.

Ultrasonic Calibration Stations
An ultrasonic sensor measures the distance to the obstacle in front of it and can therefore accurately position an object within its detection range. It works by detecting the time of return, $t$, of the emitted pulse, and the distance is given by

$$ d = \frac{v_{\text{sound}}\, t}{2} $$

where $v_{\text{sound}} \approx 340$ m/s is the velocity of sound in air. Its advantages include centimeter-resolution distance measurements and a limited span of detection angles, which make it suitable for online calibration of indoor positioning systems. Figure 2 shows typical traces of the ultrasonic sensor readings when the occupant moves across the detection zones. By properly thresholding the distance measurements, the ultrasonic sensor can be used as an indicator of occupant presence inside its detection zone.

The network consists of the deployed ultrasonic stations and a data collection center, which communicate through XBee radio modules operating on the IEEE 802.15.4 standard, more specifically, the ZigBee protocol, as shown in Figure 3. The radios are low-power and operate reliably in the indoor space, where the network is established automatically by the coordinator, in our case, the data collection center. The data collection center, controlled by an Arduino, polls the ultrasonic stations for measurements periodically, so that the measurement frequency is 1 Hz, and transfers the data to a computer connected through a serial port. Each ultrasonic station is equipped with three ultrasonic sensors, whose directions are offset by 15°. As the measurement range of each sensor spans 15°, the station covers an area of 45° in front of it, which is sufficient for indoor area localization.

Figure 3. Illustration of the configuration of the ultrasonic calibration station. The coordinator requests measurements at 1 Hz through the IEEE 802.15.4 protocol and deposits the collected data in the local database. The ultrasonic station takes three independent measurements from its sensors to detect occupant presence in the vicinity.
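The time-of-flight conversion and the thresholding described in this section can be sketched in a few lines of Python. This is only an illustration: the function names, the threshold value passed as `eta_m`, and the rule that a station reports presence when any one of its three sensors reads below the threshold are assumptions, not details given in the text.

```python
V_SOUND = 340.0  # m/s, velocity of sound in air as quoted in the text


def tof_to_distance(t_return_s):
    """Convert a round-trip time of return (seconds) to a one-way distance (metres)."""
    return V_SOUND * t_return_s / 2.0


def station_detects(distances_m, eta_m):
    """Binary presence indicator for one station.

    distances_m: readings of the station's sensors; None marks a missing reading.
    eta_m: detection threshold in metres (hypothetical value chosen by the user).
    Assumption: the station fires if ANY sensor reads below the threshold.
    """
    return any(d is not None and d < eta_m for d in distances_m)
```

For example, a return time of 10 ms corresponds to an obstacle about 1.7 m away, and a station whose readings are 3.2 m, 1.1 m and one missing value would report presence for a threshold of 1.5 m.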

Floormap Processing Engine
The indoor space is well structured and typically organized into corridors, open areas, walls, rooms, etc. Depending on the occupant's present location, the motion is constrained by these external factors. For instance, an occupant in a corridor is highly likely to keep moving along the corridor, whereas an occupant walking in an open area is free to move in any direction. Likewise, an occupant in his/her cubicle area is more likely to stay static. Based on these different motion capabilities, we categorize the indoor space into several contexts, namely, open space, constrained space and static space. The floormap processing engine is designed to convert the original floormap into the contextual floormap, which indicates the context of each point in the original floormap. The details of each component of the contextual floormap are provided in Table 1. We use the term "canonical direction" to refer to the direction of a constrained space along which movement has more freedom.

In addition, occupant motion is also constrained by speed restrictions. Another function of the floormap processing engine is to compute the reachable set, which contains all the points that can be reached at an admissible speed from a given starting point. In the indoor space, the geographical distance between two positions on a floormap does not necessarily equal the walking distance between them, because walls and other obstacles block the way. Hence, the physical features of the indoor environment would be ignored if the reachable set were simply a disc of fixed radius centered at the starting point. The floormap processing engine addresses this problem by converting the floormap to a graph in which every non-barricade node is connected to its neighboring non-barricade nodes and the barricade nodes have no connections to any other nodes. In this way, the reachable set of a given node can be computed by finding the nodes within the maximum depth from the root node, which can be done efficiently by the breadth-first search algorithm [22].
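The reachable-set computation can be illustrated with a depth-limited breadth-first search on a grid. The sketch below assumes a rasterized floormap with 4-connected cells and a boolean barricade mask; the grid representation, the function name and the choice of 4-connectivity are assumptions made for illustration. In practice `max_depth` would be derived from the admissible speed, the sampling period and the cell size.

```python
from collections import deque


def reachable_set(grid, start, max_depth):
    """Cells reachable from `start` in at most `max_depth` 4-connected steps.

    grid[r][c] is True for a barricade cell (wall, obstacle), False otherwise.
    start is a (row, col) tuple. Barricade cells are never entered, so a cell
    that is close in straight-line distance but behind a wall is excluded.
    """
    rows, cols = len(grid), len(grid[0])
    if grid[start[0]][start[1]]:
        return set()
    seen = {start}
    queue = deque([(start, 0)])
    while queue:
        (r, c), depth = queue.popleft()
        if depth == max_depth:
            continue  # do not expand beyond the depth limit
        for dr, dc in ((1, 0), (-1, 0), (0, 1), (0, -1)):
            nr, nc = r + dr, c + dc
            inside = 0 <= nr < rows and 0 <= nc < cols
            if inside and not grid[nr][nc] and (nr, nc) not in seen:
                seen.add((nr, nc))
                queue.append(((nr, nc), depth + 1))
    return seen
```

On a 3 x 3 grid with a wall occupying the top two cells of the middle column, the top-right cell is only two cells away from the top-left cell in straight-line terms but six steps away on foot, so it enters the reachable set only when `max_depth` is at least 6.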

Information Fusion Framework
In this section, we propose an information fusion framework that manages the heterogeneous sensor measurements as well as the floormap and occupants' context-related motion characteristics to provide an online estimate of occupants' location. There are two key components in the fusion framework: Context-Dependent Kinematic Models (CDKM) and Probabilistic Sensor Measurement Models (PSMM). CDKM is based on the observation that occupants' movements exhibit distinctive features in different parts of buildings as described in Section 2.3, and it captures this context-dependency by defining different kinematic models for distinctive contexts. PSMM models each sensor measurement as a probability distribution and multiple sensor data are combined via Bayes' rule to support the location inference.

Problem Formulation
Consider that the indoor space of interest is composed of $M$ contexts, in each of which occupants exhibit a particular kinematic pattern. Denote the context at time $k$ by $m_k$, where $m_k \in \{FS, CS_1, \cdots, CS_R, SS\}$. The subscript of $CS$ indexes the direction of the constrained space, and $R$ is the total number of different directions. Let the state $x_k = (z_k, m_k)$ consist of the position and velocity components of the occupant in Cartesian coordinates, $z_k = (x_k, y_k, \dot{x}_k, \dot{y}_k)$, as well as the context $m_k$. If the position is known, the context can be uniquely determined by the contextual floormap. We characterize this correspondence via a function $M : \mathbb{R}^4 \to \mathbb{R}$, which assigns a specific context $m_k$ to $z_k$. The tracking problem can be viewed as a statistical filtering problem in which $z_k$ is to be estimated based on a set of noisy measurements $y_{1:k} = \{y_1, \cdots, y_k\}$ up to time $k$. Specifically, $y_k$ denotes the measurements available at time $k$ and, in our case, includes measurements from multiple sensors, $\{y_k^n\}_{n=1}^{N_s}$, where $N_s$ is the total number of sensors deployed in the space of interest. We model the uncertainty about the observations and the states by treating them as random variables and assigning a probability distribution to each random variable. In this setting, we want to compute the posterior distribution of the state given the measurements up to time $k$, i.e., $p(z_k \mid y_{1:k})$.
The impact of introducing the context as an auxiliary state variable is manifold. Firstly, the transition of contexts from $m_{k-1}$ to $m_k$ determines the type of motion executed during the time interval $(k-1, k]$. For instance, if the context remains the same, the occupant follows the motion type defined by the two identical contexts; if, on the contrary, the context varies during $(k-1, k]$, the occupant executes a motion that is defined by neither of the contexts, and for simplicity we assume a free motion. That is, the position/velocity state at time $k$, $z_k$, depends stochastically not only on the past state $z_{k-1}$ and $m_{k-1}$, but also on the current context $m_k$. Moreover, there is a deterministic mapping between $z_k$ and $m_k$, as specified by the contextual map.

In order to facilitate visualization and analysis of the complex dependencies among the variables, we use a factor graph to represent the states, the observations and the functions bridging these variables, as illustrated in Figure 4. A factor graph has two types of nodes, a variable node for each variable and a function node for each local function, which are indicated by circles and squares, respectively. The edges in the graph represent the "is an argument of" relation between variables and local functions. For example, the function $T_k$ has four arguments, $z_k$, $z_{k-1}$, $m_{k-1}$ and $m_k$. Three types of local functions are involved in our model:
• $T_k(z_k, z_{k-1}, m_{k-1}, m_k) = p(z_k \mid z_{k-1}, m_{k-1}, m_k)$: transition model, or how the position/velocity state evolves given the previous and current contexts. It is specified by the CDKM.
• $O_k(z_k, y_k) = p(y_k \mid z_k)$: observation model, or how the unknown states and sensor observations relate. We will introduce the PSMM, where the relationship between locations and sensor observations is characterized by conditional probabilities and multiple sensor observations are combined via Bayes' theorem.
• $C_k(z_k, m_k)$: characteristic function that checks the validity of the correspondence between $z_k$ and $m_k$ using the contextual floormap.
Note that the prior knowledge abstracted from the floormap enters this problem naturally, through the definition of the characteristic function and the parameterization of the transition model, as will be elaborated in the following sections.
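As a worked illustration of what the factor graph encodes, the global function it represents can be written as the product of the local functions over a horizon of $K$ steps. The horizon $K$ and the name $g$ are introduced here only for illustration, and any prior on the initial state is left out because the text does not specify one.

```latex
g(z_{0:K}, m_{0:K}; y_{1:K})
  = \prod_{k=1}^{K}
      T_k(z_k, z_{k-1}, m_{k-1}, m_k)\,
      O_k(z_k, y_k)\,
      C_k(z_k, m_k)
```

Written this way, it is easy to see where cycles come from: $z_k$ and $m_k$ are both arguments of $T_k$ and of $C_k$, so the graph contains the loop $z_k - T_k - m_k - C_k - z_k$ at every time step.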

Context-Dependent Kinematic Model
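We assume that, given $z_{k-1}$, $m_{k-1}$ and $m_k$, the current position/velocity $z_k$ follows a Gaussian distribution, of which the mean and covariance matrix are specified as

$$ p(z_k \mid z_{k-1}, m_{k-1}, m_k) = \mathcal{N}\big(z_k;\ F(m_{k-1}, m_k)\, z_{k-1},\ G\, Q(m_{k-1}, m_k)\, G^\top\big) \tag{2} $$

The equivalent state space model of Equation (2) is given by

$$ z_k = F(m_{k-1}, m_k)\, z_{k-1} + G\, w_k, \qquad w_k \sim \mathcal{N}\big(0,\ Q(m_{k-1}, m_k)\big) $$

where $F(m_{k-1}, m_k) \in \mathbb{R}^{4 \times 4}$ determines the mean of the distribution of the next state. Letting $a$ denote the acceleration, we have the following kinematic equations along the $x$ direction (and analogously along $y$),

$$ x_k = x_{k-1} + \dot{x}_{k-1} T + \tfrac{1}{2} a T^2 \tag{4} $$

$$ \dot{x}_k = \dot{x}_{k-1} + a T \tag{5} $$

where $T$ is the sampling period. We assume constant velocity in this paper and model $a$ as a Gaussian noise term. If we manipulate Equations (4) and (5) into matrix form, it can be identified that $F(m_{k-1}, m_k)$ has two possible values, corresponding to moving or remaining static,

$$ F_0 = \begin{pmatrix} 1 & 0 & T & 0 \\ 0 & 1 & 0 & T \\ 0 & 0 & 1 & 0 \\ 0 & 0 & 0 & 1 \end{pmatrix}, \qquad F_1 = \begin{pmatrix} 1 & 0 & 0 & 0 \\ 0 & 1 & 0 & 0 \\ 0 & 0 & 0 & 0 \\ 0 & 0 & 0 & 0 \end{pmatrix} $$

$F_1$ forces the velocity component of the state $z_k$ to be zero, and $F = F_1$ when the context remains static space, i.e., $m_{k-1} = m_k = SS$; otherwise, $F = F_0$.

The matrix $G$ is given by

$$ G = \begin{pmatrix} T^2/2 & 0 \\ 0 & T^2/2 \\ T & 0 \\ 0 & T \end{pmatrix} $$

$Q(m_{k-1}, m_k)$ stands for the process noise covariance and, as the notation indicates, it is also a function of the context transition from $k-1$ to $k$. We adopt the concept of directional noise to handle the constraints imposed by the contextual map. To see this, note that occupants in the free space ($m_{k-1} = m_k = FS$) can move in any direction with equal probability; we therefore use equal process noise variance in both the $x$ and $y$ directions, i.e.,

$$ Q_0 = \begin{pmatrix} \sigma_x^2 & 0 \\ 0 & \sigma_y^2 \end{pmatrix}, \qquad \sigma_x^2 = \sigma_y^2 $$

For occupants moving in a constrained space ($m_{k-1} = m_k = CS_i$, $\forall i = 1, \cdots, R$) such as a corridor, more uncertainty exists along the corridor than orthogonal to it. Denote the variances along and orthogonal to the corridor by $\sigma_a^2$ and $\sigma_o^2$ ($\sigma_a^2 > \sigma_o^2$), respectively, and let the canonical direction of the constrained space $CS_i$ be specified by the angle $\phi_i$ (measured clockwise from the $y$-axis). Then the process noise covariance matrix corresponding to the motion in the constrained space is given by

$$ Q_i = \begin{pmatrix} \cos\phi_i & \sin\phi_i \\ -\sin\phi_i & \cos\phi_i \end{pmatrix} \begin{pmatrix} \sigma_o^2 & 0 \\ 0 & \sigma_a^2 \end{pmatrix} \begin{pmatrix} \cos\phi_i & -\sin\phi_i \\ \sin\phi_i & \cos\phi_i \end{pmatrix} $$

The preceding model specification covers the scenarios where the context remains the same during the time interval $(k-1, k]$ and the occupant keeps the motion type defined by the two identical contexts. On the contrary, if the context switches during the time interval $(k-1, k]$, we assume a free motion pattern, i.e., $F = F_0$, $Q = Q_0$. Table 2 summarizes our model given all possible context transitions.

Table 2. Context-dependent kinematic models.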

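Context Transition | Model Specification
$m_{k-1} = m_k = FS$ | $F = F_0$, $Q = Q_0$
$m_{k-1} = m_k = CS_i$, $i = 1, \cdots, R$ | $F = F_0$, $Q$ directional: variance $\sigma_a^2$ along and $\sigma_o^2$ orthogonal to the canonical direction $\phi_i$
$m_{k-1} = m_k = SS$ | $F = F_1$
$m_{k-1} \neq m_k$ | $F = F_0$, $Q = Q_0$ (free motion)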
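To make the context-dependent kinematic model concrete, the sketch below builds the matrices for one context transition and draws one sample of the next state with NumPy. It is a minimal illustration under stated assumptions: contexts are encoded as strings such as "FS", "SS" and "CS1", the function names are hypothetical, the static-space matrix is taken as diag(1, 1, 0, 0), and the noise covariance used in the static space is set to the free-space one because the text available here does not state it.

```python
import numpy as np


def cdkm_matrices(m_prev, m_curr, T, sigma_free, sigma_a, sigma_o, phi):
    """Return (F, G, Q) for the transition m_prev -> m_curr.

    phi maps a constrained-space label (e.g. "CS1") to its canonical angle in
    radians, measured clockwise from the y-axis.
    """
    F0 = np.array([[1, 0, T, 0], [0, 1, 0, T], [0, 0, 1, 0], [0, 0, 0, 1]], float)
    F1 = np.diag([1.0, 1.0, 0.0, 0.0])  # assumed form: keep position, zero velocity
    G = np.array([[T**2 / 2, 0], [0, T**2 / 2], [T, 0], [0, T]], float)
    Q0 = sigma_free**2 * np.eye(2)
    if m_prev != m_curr or m_curr == "FS":
        return F0, G, Q0  # free motion, also used when the context switches
    if m_curr == "SS":
        return F1, G, Q0  # ASSUMPTION: Q for the static space is not given in the text
    a = phi[m_curr]
    R = np.array([[np.cos(a), np.sin(a)], [-np.sin(a), np.cos(a)]])
    Q = R @ np.diag([sigma_o**2, sigma_a**2]) @ R.T  # larger variance along the corridor
    return F0, G, Q


def propagate(z_prev, F, G, Q, rng):
    """Draw z_k = F z_{k-1} + G w with w ~ N(0, Q)."""
    w = rng.multivariate_normal(np.zeros(2), Q)
    return F @ z_prev + G @ w
```

With an angle of zero the corridor runs along the y-axis, so the larger variance appears in the y entry of the covariance; with an angle of 90 degrees it moves to the x entry.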

Probabilistic Sensor Measurement Model
We construct a probabilistic model for each sensor, so that multisensor fusion can be performed via Bayes' rule. Assuming that the $N_s$ different sensors function independently, the observation model $p(y_k \mid z_k)$ can be factored as

$$ p(y_k \mid z_k) = \prod_{n=1}^{N_s} p(y_k^n \mid z_k) $$

This forms a convenient and unified interface for combining distinctive sensor data, by projecting the heterogeneous measurements ($y^n$) to the probability space via the likelihood function $p(y^n \mid z)$. If one more sensor is added to the system, the observation model can be updated simply by multiplying in the corresponding likelihood. A different likelihood function needs to be trained for each type of sensor.
WiFi Measurement. In free space, the WiFi signal strength is a log-linear function of the distance between the transmitter and the receiver. However, due to the multipath effect caused by obstacles and moving objects in indoor environments, the log-linear relationship no longer holds. Previous work has proposed adding a Gaussian noise term to account for the variations arising from the multipath effect; however, this simple model-based method can hardly guarantee reasonable performance in practice. Another popular approach is to construct a WiFi database comprising WiFi measurements at known locations to fingerprint the space of interest, but it requires onerous calibration to ensure accuracy. We propose a novel WiFi modeling method based on a relatively small WiFi training set to accommodate the complex variations of WiFi signals in the indoor space. The key insight is to use a Gaussian process (GP) to model the WiFi signal, where the simple model-based method provides a prior over the function space of the GP.
We collect WiFi signal strength data at $N_w$ reference points over the space and let $\{l_j, y_w^j\}_{j=1}^{N_w}$ denote the resulting training set of locations and signal strengths. The signal strength is modeled as a GP whose mean function $\mu(\cdot)$ is imposed to be a linear model with the parameters adapted to the training samples, and whose covariance function $k(\cdot, \cdot)$ takes the squared exponential form. The posterior mean and variance of the signal strength at a test location $l_*$ are given by the classical GP predictive equations, Equations (13) and (14), where $L$ and $y_w$ are the vectors concatenated by $\{l_j\}_{j=1}^{N_w}$ and $\{y_w^j\}_{j=1}^{N_w}$, respectively. $K(l_*, L)$ denotes the $1 \times N_w$ matrix of the covariances evaluated at all pairs of training and testing points, and similarly for the other entries $K(L, L)$ and $K(L, l_*)$.

In previous work using a GP to model the WiFi signal strength [24], the WiFi signal is assumed to follow a Gaussian distribution with the mean and variance given by Equations (13) and (14), respectively. However, the posterior variance derived from the GP is an indicator of estimation confidence. It depends largely on the density of training samples in the vicinity of the evaluated position. That is, if the evaluated point $l_*$ happens to fall into an area that is densely calibrated, the posterior variance will be relatively small. The posterior variance derived from the GP therefore cannot truly reflect the variations of WiFi signals over time, and the likelihood in our model does not use the posterior variance (14) of the classical predictive equations.

Ultrasonic Measurement. Essentially, each of the ultrasonic sensors in the ultrasonic station can output the distance to the occupant passing in front of it. However, due to missing data and measurement noise, the distance measurement is not always steady. Here, we consider the ultrasonic station to be a binary sensor that indicates occupancy in its detection zone. To be specific, the likelihood function is modeled as

$$ p(y_k < \eta \mid z_k \text{ in the detection zone}) = 1 \tag{16} $$

where $\eta$ is the threshold for ultrasonic measurements.
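The GP-based WiFi model can be sketched with NumPy as follows. This is an illustrative sketch, not the authors' implementation, and several choices are assumptions: the mean is fitted as a linear function of log10 distance to a single AP, the predictive mean uses the standard noise-free GP formula with a small jitter for numerical stability, the kernel hyperparameters `sigma_f` and `length` are supplied by the user, and the likelihood is a Gaussian around the posterior mean with a fixed standard deviation `sigma_w` in place of the GP posterior variance.

```python
import numpy as np


def se_kernel(A, B, sigma_f, length):
    """Squared exponential covariance between location sets A (n, 2) and B (m, 2)."""
    d2 = ((A[:, None, :] - B[None, :, :]) ** 2).sum(axis=-1)
    return sigma_f**2 * np.exp(-0.5 * d2 / length**2)


def fit_linear_mean(L, y, ap_xy):
    """Least-squares fit of RSS = b0 + b1 * log10(distance to the AP)."""
    def features(P):
        d = np.maximum(np.linalg.norm(P - ap_xy, axis=1), 1e-3)
        return np.column_stack([np.ones_like(d), np.log10(d)])
    beta, *_ = np.linalg.lstsq(features(L), y, rcond=None)
    return lambda P: features(P) @ beta


def gp_posterior_mean(l_star, L, y, mu, sigma_f, length, jitter=1e-6):
    """mu(l*) + K(l*, L) K(L, L)^{-1} (y - mu(L)), with jitter on the diagonal."""
    K = se_kernel(L, L, sigma_f, length) + jitter * np.eye(len(L))
    k_star = se_kernel(l_star, L, sigma_f, length)
    return mu(l_star) + k_star @ np.linalg.solve(K, y - mu(L))


def wifi_log_likelihood(rss, l_star, L, y, mu, sigma_f, length, sigma_w):
    """Gaussian log-likelihood around the GP mean with a FIXED std (assumption)."""
    m = gp_posterior_mean(l_star, L, y, mu, sigma_f, length)
    return -0.5 * ((rss - m) / sigma_w) ** 2 - np.log(sigma_w * np.sqrt(2 * np.pi))
```

The fitted mean plays the role of the model-based prior, and the kernel term corrects it locally wherever the training samples deviate from the log-linear trend.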

Characteristic Function
The characteristic function imposes constraints on the correspondence between the position and the context, and embodies the prior knowledge available from the floormap. In the preceding section, we defined a function $M$ that sets up the relationship between the context and the position/velocity, i.e., $m_k = M(z_k)$, and $M$ can be readily read out from the contextual map. We thereby define the characteristic function to be

$$ C_k(z_k, m_k) = I[m_k = M(z_k)] $$

where $I[\cdot]$ is an indicator function. In other words, the characteristic function enforces the local correspondence defined by $M$.
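As an illustration, the mapping from a state to its context and the resulting indicator can be written as a lookup into a rasterized contextual floormap. The grid layout, the cell size and the function names below are hypothetical, as is the convention that rows index the y coordinate and columns index the x coordinate.

```python
def make_context_lookup(context_grid, cell_size):
    """Build M(z): state (x, y, vx, vy) -> context label of the cell containing (x, y).

    context_grid[row][col] holds a label such as "FS", "SS" or "CS1";
    cell_size is the side length of one grid cell in metres.
    """
    def M(z):
        x, y = z[0], z[1]
        row, col = int(y // cell_size), int(x // cell_size)
        return context_grid[row][col]
    return M


def characteristic(z, m, M):
    """Indicator I[m == M(z)]: 1 if the context agrees with the map, else 0."""
    return 1 if M(z) == m else 0
```

For a 0.5 m grid, the state with position (1.2, 0.7) falls in column 2 of row 1, and the indicator is 1 only for the context stored in that cell.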

Context-Augmented Particle Filter
In this section, we discuss how to perform inference on the factor graph underlying the tracking problem formulated previously. The particle filter is a technique for implementing a recursive Bayesian filter by Monte Carlo simulations [25]. The key idea of the particle filter is to represent the required posterior density function by a set of random samples, or "particles", associated with discrete probability masses, and to compute the state estimate based on these particles. The original particle filter proposed by Gordon et al. [26] was designed for a simple hidden Markov chain, which is also a cycle-free factor graph, and uses the Sampling Importance Resampling (SIR) algorithm to propagate and update the particles. However, the factor graph in our problem, as illustrated in Section 4, does have cycles due to the introduction of the context variable, and only approximate inference algorithms exist. We present a recursive approximate inference method for the cyclic factor graph by extending the particle filter, and the resulting algorithm is termed the Context-Augmented Particle Filter (CAPF).
To see the operation of the CAPF, consider a set of particles $\{z_{k-1}^i, m_{k-1}^i\}_{i=1}^{N}$ that represents the posterior distribution $p(z_{k-1}, m_{k-1} \mid y_{1:k-1})$ of the state. Note that $m_{k-1}^i$ can be uniquely determined by $z_{k-1}^i$ via the characteristic function. At time $k$, we have some new measurement $y_k$. It is required to construct a new set of particles $\{z_k^i, m_k^i\}_{i=1}^{N}$ which characterizes the posterior distribution $p(z_k, m_k \mid y_{1:k})$. Now, suppose we have an "oracle" that is capable of providing the context value $m_k^i$ of the corresponding $z_k^i$ even before we generate the $z_k^i$'s; then our task is equivalent to drawing samples from the distribution

$$ p(z_k \mid m_k, y_{1:k}) \tag{18} $$

This can be carried out in two steps. First, the historical density $p(z_{k-1}, m_{k-1} \mid y_{1:k-1})$ is propagated via the transition model $p(z_k \mid z_{k-1}, m_k, m_{k-1})$ to produce the prediction density

$$ p(z_k \mid m_k, y_{1:k-1}) = \int p(z_k \mid z_{k-1}, m_k)\, p(z_{k-1} \mid y_{1:k-1})\, dz_{k-1} \tag{19} $$

where $p(z_k \mid z_{k-1}, m_k) = p(z_k \mid z_{k-1}, m_k, m_{k-1})$, since $m_{k-1}$ is completely determined when conditioning on $z_{k-1}$. Second, the density of interest, $p(z_k \mid m_k, y_{1:k})$, can be updated from the prediction density using Bayes' theorem,

$$ p(z_k \mid m_k, y_{1:k}) = \frac{p(y_k \mid z_k)\, p(z_k \mid m_k, y_{1:k-1})}{p(y_k \mid y_{1:k-1}, m_k)} = \gamma\, p(y_k \mid z_k)\, p(z_k \mid m_k, y_{1:k-1}) \tag{20} $$

where $\gamma$ is a normalization constant. Thus, Equations (19) and (20) form a recursive solution to Equation (18). In the particle filter framework, the aforementioned prediction and update steps are performed by propagating and weighting the random samples.

Prediction Step. In the prediction phase, we generate the predicted particles by

$$ z_k^i \sim p(z_k \mid z_{k-1}^i, m_{k-1}^i, m_k^i), \qquad i = 1, \cdots, N $$

where $\{m_k^i\}_{i=1}^{N}$ is a set of particles representing the estimates of $m_k$ produced by the "oracle". Given the different possible values of $m_{k-1}^i$ and $m_k^i$, $z_k^i$ is sampled from different models, as detailed in Table 2. We then perform a sanity check on the newly generated particles, in which the particles $z_k^i$ absent from the reachable set of $z_{k-1}^i$ are eliminated.

Update Step. To update, each predicted particle $z_k^i$ is assigned a weight proportional to its likelihood,

$$ w_k^i \propto p(y_k \mid z_k^i) $$

The weight is then normalized by

$$ \tilde{w}_k^i = \frac{w_k^i}{\sum_{j=1}^{N} w_k^j} $$

We resample $N$ times with replacement from the set $\{z_k^i\}_{i=1}^{N}$, drawing each particle with probability equal to its normalized weight. Correspondingly, the contexts $m_k^i$ are obtained through the characteristic function, i.e., $m_k^i = M(z_k^i)$.

"Oracle" Design. The oracle is supposed to be able to answer the query about the next possible contexts $m_k$, based upon which the position/velocity component of the state can be properly propagated according to the different transition models. For computational efficiency, we adopt a simple discriminative model to produce the $m_k$'s. Given a small database of WiFi fingerprints, we apply the K-Nearest Neighbors (K-NN) algorithm and a modified distance-weighted rule to generate an empirical distribution of the context. To be specific, let the WiFi database be denoted by $\{(m_j, y_w^j)\}_{j=1}^{N_w}$, where $N_w$ is the number of WiFi fingerprints. When a new WiFi observation $y_k$ queries the possible contexts, the $K$ nearest neighbors of $y_k$ are found among the given training set. Let these $K$ nearest neighbors of $y_k$, with their associated contexts, be given by $\{(m_j, y_w^j)\}_{j=1}^{K}$, and let the corresponding distances of these neighbors from $y_k$ be given by $d_j$, $j = 1, \cdots, K$. The weight $q_j$ attributed to the $j$-th nearest neighbor is then defined from these distances by the modified distance-weighted rule. We then normalize the weights, $q_j \leftarrow q_j / \sum_{j'=1}^{K} q_{j'}$, and sample the context from a discrete probability distribution built from the normalized weights and a context resilience factor $\alpha \in [0, 1]$. We incorporate $\alpha$ to accommodate the prior knowledge that the context will not change too often and to make the "oracle" more robust to observation noise. Moreover, for particles on the boundary between distinctive contexts, $m_k$ is equally likely to be any of these contexts. The pseudo-code of the CAPF algorithm is provided in Algorithm 1.
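The prediction, sanity-check, update and resampling steps, together with a K-NN context oracle, can be sketched as follows. This is a hedged illustration rather than the authors' Algorithm 1. In particular, the oracle's inverse-distance weights and the mixture "stay in the previous context with probability alpha, otherwise follow the K-NN vote" are assumptions standing in for the weighting rule and sampling distribution, and `sample_transition`, `log_lik`, `reachable` and `M` are hypothetical callables supplied by the user.

```python
import numpy as np


def knn_context_oracle(y_new, fp_rss, fp_ctx, m_prev, K, alpha, rng):
    """Sample one candidate next context per particle (assumed weighting, see lead-in)."""
    d = np.linalg.norm(fp_rss - y_new, axis=1)
    nn = np.argsort(d)[:K]                      # indices of the K nearest fingerprints
    q = 1.0 / (d[nn] + 1e-9)                    # ASSUMPTION: inverse-distance weights
    q /= q.sum()
    nn_ctx = np.array([fp_ctx[j] for j in nn])
    labels = sorted(set(fp_ctx) | set(m_prev))
    vote = np.array([q[nn_ctx == c].sum() for c in labels])
    out = []
    for m in m_prev:
        stay = np.array([c == m for c in labels], float)
        p = alpha * stay + (1.0 - alpha) * vote  # ASSUMPTION: resilience as a mixture
        out.append(labels[rng.choice(len(labels), p=p)])
    return out


def capf_step(Z, m_prev, m_next, sample_transition, log_lik, reachable, M, rng):
    """One CAPF iteration over particles Z of shape (N, 4).

    sample_transition(z, m0, m1, rng) -> new state (context-dependent kinematics)
    log_lik(z) -> float, sum of the sensors' log-likelihoods at state z
    reachable(z_old, z_new) -> bool, floormap reachable-set check
    M(z) -> context label read from the contextual floormap
    """
    N = len(Z)
    Z_pred = np.array([sample_transition(Z[i], m_prev[i], m_next[i], rng) for i in range(N)])
    logw = np.array([log_lik(z) for z in Z_pred], float)
    ok = np.array([bool(reachable(Z[i], Z_pred[i])) for i in range(N)])
    logw[~ok] = -np.inf                          # eliminated particles get zero weight
    finite = np.isfinite(logw)
    if not finite.any():
        return Z.copy(), list(m_prev)            # ASSUMPTION: keep old set if all eliminated
    w = np.exp(logw - logw[finite].max())        # subtract the max for numerical stability
    w /= w.sum()
    idx = rng.choice(N, size=N, p=w)             # resample N times with replacement
    Z_new = Z_pred[idx]
    return Z_new, [M(z) for z in Z_new]
```

A tracking loop would call the oracle with the newest WiFi observation to obtain `m_next`, then call `capf_step`, and report, for example, the mean of the resampled particles as the location estimate.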

Performance Evaluation
Our experiment was carried out in the Singapore-Berkeley Building Efficiency and Sustainability in the Tropics (SinBerBEST) located in CREATE Tower at the National University of Singapore campus, which is a typical office environment consisting of cubicles, individual offices, corridors and obstacles like walls, desks, etc. The total area of the testbed is around 1000 m². In total, 10 WiFi routers and four ultrasonic stations are deployed in the testbed. We utilize TP-LINK TL-WDR4300 Wireless

Experimental methodology. In a real-world setting, we expect occupants to carry their smartphones as they walk through various sections of an indoor space. Moreover, occupants are unlikely to walk continuously; they would walk between locations of special interest and dwell at certain locations for a significant length of time. Our experiment aims at emulating these practical scenarios in an office environment and at covering all the contexts defined in our model. Therefore, the following routes were designed as the ground truth for evaluation: (1) A enters the office from the front gate and walks through the corridors to find her colleague (different CSs are included); (2) B enters the office from the side door, walks to her own seat, stays there for a while and exits the office from the front gate (CSs and SS are included); (3) C enters the office from the front gate, walks through corridors, spends some time at her office and goes to the open area (CSs, SS and FS are included). We asked the experimenter to behave as usual when walking in the space. At the same time, the WiFi APs and ultrasonic stations constantly collect measurements and send them to the central server.
To obtain the ground truth at the sampling times of the tracking system, we mark the ground with a 1 m grid along the pre-specified route and ask the experimenter to record a lap time with a stopwatch each time a grid mark is reached. By recording the starting time of the experiment, we obtain the time stamp of each grid mark and then interpolate the ground truth at the sampling times.
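The interpolation step can be sketched with `numpy.interp`. The sketch assumes linear interpolation between consecutive grid marks, that is, a constant walking speed between two lap times; the text does not say which interpolation scheme was used, and the function name is hypothetical.

```python
import numpy as np


def interpolate_ground_truth(lap_times_s, marks_xy, sample_times_s):
    """Ground-truth positions at the tracker's sampling times.

    lap_times_s: increasing stopwatch times (s) at which each grid mark was reached
    marks_xy: (n, 2) coordinates of those grid marks in metres
    sample_times_s: times (s) at which the tracking system produced estimates
    Assumes straight-line, constant-speed motion between consecutive marks.
    """
    marks_xy = np.asarray(marks_xy, float)
    x = np.interp(sample_times_s, lap_times_s, marks_xy[:, 0])
    y = np.interp(sample_times_s, lap_times_s, marks_xy[:, 1])
    return np.column_stack([x, y])
```

For instance, if the marks (0, 0), (1, 0) and (1, 1) are reached at 0 s, 1 s and 2.5 s, the interpolated position at 1.75 s is halfway between the last two marks.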
Does the "oracle" work? The context estimation performed by the "oracle" is critical to the CAPF algorithm, as the tuple of the current and previous contexts jointly steers the states in our model. Here, we evaluate the context prediction performance of the "oracle" that we constructed in light of the design rules presented in Section 4. Figure 6 illustrates the result of the context estimation for different walks. Since the context estimates are represented by a set of particles in the algorithm, we visualize the context estimate by purple lines centered at the possible contexts, with the lengths of the lines scaled by the proportions of particles in the different contexts. Ideally, the purple cloud should scatter around the ground truth context. Figure 6 suggests that the estimates given by the "oracle" generally capture the ground truth. Evidently, the context estimate is not perfect, especially for the static space (SS). However, these imperfect estimates essentially present other possibilities for the current context and keep particles from being trapped in the static space. We define the context estimation accuracy as the ratio of the number of particles with the correct context estimate to the total number of particles. The context estimation accuracy is calculated for each time step of the experiments, and its empirical distribution is illustrated in Figure 7, where the mean accuracy is 52.41%. With this noisy "oracle", the system achieves a median tracking error of 1.96 m, while the tracking error would be 1.84 m if a perfect "oracle" were utilized. Therefore, our work has the potential to be further improved with a more advanced "oracle" design.

Figure 8 shows some snapshots of the CAPF algorithm in progress. At the beginning, the particles are initialized to be uniformly distributed in the space. The spread of the particles then shrinks as new WiFi observations arrive.
When the ultrasonic station reports a detection, the particles are concentrated in the corresponding detection zone. As the occupant exits the detection zone, the particles spread out along the direction of the corridor. When the occupant sits in the cubicle, the particles are distributed over the seating area as well as over some possible routes through which the occupant might leave the seating area. The particles are distributed evenly along different directions when the occupant is moving in the free space, in which case our model is identical to the traditional constant velocity dynamic model for the particle filter.

MapSentinel's tracking performance. We aggregate the data from different walks and compare the performance of MapSentinel against the fusion system of WiFi and ultrasonic stations that does not leverage the floormap information, as well as against the purely WiFi-based tracking system. The tracking error distributions are depicted in Figure 9. As can be seen, MapSentinel achieves a substantial performance improvement, 31.3% over the WiFi tracking system and 29.1% over the fusion scheme. Note that adding the ultrasonic calibration to the WiFi system yields only a small accuracy gain. Due to the high degree of uncertainty of WiFi signals, the effect of an ultrasonic calibration does not last long. The map information prolongs the effect of the ultrasonic calibration by imposing additional constraints on the motion, which is why MapSentinel greatly enhances the tracking performance compared with the purely WiFi-based system. We also evaluate the tracking performance in different contexts, and the result is shown by boxplots in Figure 10. Here, "without map" means using the WiFi and ultrasonic sensing systems without taking into account either the reachable set or the context-dependent kinematic model. A unified dynamical model, the free space model, is applied in this case, and a traditional particle filter is implemented to estimate the location. As can be read from the figure, MapSentinel performs better in all contexts. A more significant gain is achieved in constrained spaces and static spaces, as expected.
Tracking Error (m)   Figure 11 compares the performance of tracking systems with distinctive floormap usage. MapSentinel exploits the floormap information in two folds: first, MapSentinel integrates the context information into the kinematic model, and the movement patterns of people on different locations of the map are better captured. Secondly, MapSentinel takes into account the speed restrictions as well as physical obstacles in the indoor space by checking if the particles fall inside the reachable set at each time step. The second fold of the floormap information has been widely utilized in the previous work, while the context information is less explored. We therefore compare the tracking error of our system with the one that merely uses the reachable conditions. Figure 11 shows that incorporating information about physical constraints, as the previous work did, is surely beneficial to the tracking system. Particularly, the performance can be further improved by 19.8% by introducing the context information into the tracking system. To better understand how the map helps improve the location estimation, we demonstrate the velocity estimation of different tracking schemes in Figure 12. Typically, the occupants will not perform complex motions in the indoor space due to the constraints of the wall and other barricades. The more the velocity estimate deviates from the canonical directions defined by the indoor environment, the worse the tracking performance can be. Using the fusion schemes of WiFi and ultrasonic calibration, only the location is the observable state. The velocity estimates depend largely on the location estimate and it has little effect in smoothing out the location estimate. Hence, extensive research has been focusing on using inertial measurements to perform dead reckoning, which makes the velocity observable. 
Analogously, MapSentinel creates a virtual inertial sensor for the occupant, which mimics an actual inertial sensor by providing the plausible walking speeds and directions. As shown in Figure 12, the velocity estimate without map information can point in any direction, whereas MapSentinel constrains the velocity via the context-dependent kinematic model.
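The two uses of the floormap discussed above can be illustrated with a minimal sketch of one particle propagation step. This is not the Context-Augmented Particle Filtering algorithm itself. The toy floor layout, the context labels, the speed ranges, the heading noise, and the function names (`context_of`, `sample_velocity`, `in_reachable_set`, `propagate`) are all assumptions made for illustration, and the reachable-set test is reduced to a floor-boundary check plus a speed bound, with no wall-crossing test.

```python
import math
import random

# Hypothetical context labels and parameters; none of these values come from the paper.
CORRIDOR, OPEN_SPACE, STATIC = "corridor", "open", "static"
MAX_SPEED = 2.0                # m/s, assumed bound on indoor walking speed
FLOOR_W, FLOOR_H = 10.0, 6.0   # m, assumed rectangular floor


def context_of(x, y):
    """Toy floormap: corridor along the x axis, static zone at the far end."""
    if y < 2.0:
        return CORRIDOR
    if x >= 8.0:
        return STATIC
    return OPEN_SPACE


def sample_velocity(context, rng):
    """Context-dependent kinematic model acting as a 'virtual inertial sensor'."""
    if context == CORRIDOR:
        # Constrained motion: heading stays close to the corridor axis.
        speed = rng.uniform(0.5, 1.5)
        heading = rng.choice([0.0, math.pi]) + rng.gauss(0.0, 0.1)
    elif context == STATIC:
        # Static zone: the occupant is almost stationary.
        speed = abs(rng.gauss(0.0, 0.05))
        heading = rng.uniform(-math.pi, math.pi)
    else:
        # Open space: free movement in any direction.
        speed = rng.uniform(0.0, 1.5)
        heading = rng.uniform(-math.pi, math.pi)
    return speed * math.cos(heading), speed * math.sin(heading)


def in_reachable_set(old, new, dt):
    """Simplified reachability: stay on the floor and respect the speed bound."""
    inside = 0.0 <= new[0] <= FLOOR_W and 0.0 <= new[1] <= FLOOR_H
    step = math.hypot(new[0] - old[0], new[1] - old[1])
    return inside and step <= MAX_SPEED * dt


def propagate(particles, dt, rng):
    """Move each (x, y, weight) particle; zero the weight of unreachable ones."""
    moved = []
    for x, y, w in particles:
        vx, vy = sample_velocity(context_of(x, y), rng)
        new = (x + vx * dt, y + vy * dt)
        if not in_reachable_set((x, y), new, dt):
            w = 0.0
        moved.append((new[0], new[1], w))
    total = sum(w for _, _, w in moved)
    if total == 0.0:
        # Every particle was rejected: fall back to the previous cloud.
        return list(particles)
    return [(x, y, w / total) for x, y, w in moved]
```

Calling `propagate(cloud, 1.0, random.Random(0))` on a list of `(x, y, weight)` tuples advances the cloud by one second. In this sketch the corridor branch plays the role that heading information from dead reckoning would otherwise play: particles in the corridor can only drift along its axis, which keeps the implied velocity close to the canonical directions of the space.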

Conclusions
This paper presents MapSentinel, a system for real-time location tracking that emphasizes both non-intrusiveness and accuracy. The non-intrusive sensing network comprises modified WiFi access points and ultrasonic calibration stations. MapSentinel makes a novel attempt to exploit the floormap information by categorizing the indoor space into different contexts that capture the diversity of typical motion characteristics. This mimics having an inertial sensor attached to the occupant to obtain knowledge of the velocity. We formalize the fusion of the floormap information and the noisy sensor readings using the Factor Graph, and develop the Context-Augmented Particle Filtering algorithm to efficiently solve real-time walking trajectories. Our evaluation in a large, typical office environment shows that MapSentinel achieves a performance improvement of 31.3% over the purely WiFi-based tracking system. MapSentinel is among the early attempts to obviate the need for inertial sensors in indoor tracking, and our results are promising.
For future work, we would like to explore multiple-occupant tracking. The ultrasonic sensor is essentially anonymous and cannot identify the occupant entering its detection zone. The WiFi access points can identify an occupant from the MAC address of the mobile device and can approximately tell which occupant is approaching the ultrasonic station. The ultrasonic calibration will work if the occupant can be identified from the MAC information without ambiguity; however, if the identity of the occupant within range cannot be uniquely determined, as in a crowded scenario, the calibration may not work effectively. Further work is needed to reliably track multiple occupants. Moreover, we would like to integrate our tracking method into the control of lighting and ventilation systems to improve the energy efficiency of buildings.