Unsupervised Learning in RSS-Based DFLT Using an EM Algorithm

Received signal strength (RSS) changes of static wireless nodes can be used for device-free localization and tracking (DFLT). Most RSS-based DFLT systems require access to calibration data, either RSS measurements from a time period when the area was not occupied by people, or measurements while a person stands in known locations. Such calibration periods can be very expensive in terms of time and effort, making system deployment and maintenance challenging. This paper develops an Expectation-Maximization (EM) algorithm based on Gaussian smoothing for estimating the unknown RSS model parameters, liberating the system from supervised training and calibration periods. To fully use the EM algorithm’s potential, a novel localization-and-tracking system is presented to estimate a target’s arbitrary trajectory. To demonstrate the effectiveness of the proposed approach, it is shown that: (i) the system requires no calibration period; (ii) the EM algorithm improves the accuracy of existing DFLT methods; (iii) it is computationally very efficient; and (iv) the system outperforms a state-of-the-art adaptive DFLT system in terms of tracking accuracy.


Introduction
In developed nations, the demographic change in the population is creating many challenges both from a societal and an economic standpoint. Research into aging, agerelated conditions, and the means to support an aging population has therefore become a priority for many governments around the world [1]. Ambient assisted living (AAL) is the European Union's funding program that aims at developing "information and communication technologies (ICT) in a person's daily living and working environment to enable them to stay active for longer duration, remain socially connected and live independently into old age" (www.aal-europe.eu, accessed on 17 August 2021). This can be achieved by developing both preventive and monitoring systems for aging safely at home, in the community, and at work. In this regard, radio frequency (RF) sensing is particularly suitable for realizing AAL systems. The main advantages of the technology are four-fold: 1. it is passive and does not require users to carry obtrusive and uncomfortable sensors [2]; 2. it is suitable for a wide range of monitoring purposes such as localization and tracking [3], fall detection [4], vital sign monitoring [5], gesture recognition [6] and behavioral sensing [7]; 3. RF signals can penetrate walls, clutter and other occlusions, unlike many other sensors that have a limited field of view [8]; and 4. it is privacy preserving which increases acceptance of the monitoring technology, unlike vision-based systems that are intrusive [9]. The components of a typical and proposed DFLT system. The gray shaded block and localization-and-tracking block is common to conventional systems, whereas the proposed system is composed of the non-shaded blocks only. The notation is introduced in Sections 3 and 4.
The work in this paper addresses three major problems associated with the calibration of typical DFLT systems. First, most DFLT methods require an empty-room calibration period when the area is not occupied by people to calculate the mean RSS level µ [2,16]. Typically, the RSS changes are defined with respect to µ and an accurate estimate is a strict requirement of DFLT. Second, it is a common presumption that the model parameters are the same for all links [17,18], i.e., for all transmitter, receiver and frequency channel combinations. Third, model-based DFLT methods that take into account the unique model parameters for each link, require a calibration period when a person moves along a known training trajectory to estimate them [13,19]. If µ is constant and the model parameters are the same for all the links, only a short empty-room calibration period is enough. However, neither of these are valid assumptions in general scenarios. In Figure 2, the measured and modeled RSS as a function of excess path length (see Equation (6)) for two example links is illustrated. As shown, the model parameters of the two links differ significantly from one another invalidating the common assumption. The used model is the exponential model [20] (see Equation (26)) with parameters, Θ = [µ, φ, λ, σ 2 ], which are estimated using nonlinear least squares over the measurements and known trajectory. This paper aims to address the limitations and drawbacks of typical DFLT methods by developing a system that does not require separate empty-room calibration periods and that can learn the unique model parameters for each link during the time of operation. The development efforts of this paper are validated using a 75 m 2 open indoor deployment and a 82 m 2 residential apartment deployment, and it is shown that the proposed system can achieve an average tracking accuracy as low as 17 cm in the open environment and 37 cm in the apartment. This paper makes the following contributions: • A Gaussian filter is presented to estimate the state of the target and a novel Measurement Selection unit is developed to select and combine the measurement models of two DFLT methods into one filtering algorithm. The developed system is demonstrated to outperform a state-of-the-art adaptive DFLT system and reduce the tracking error by 42%. • A Gaussian smoother is implemented, and it is used to evaluate the expectations involved in the expectation step of the Expectation-Maximization (EM) algorithm. Moreover, we show how the maximization step of the EM algorithm is available in closed form for the considered measurement model. The presented EM algorithm is computationally very efficient, up to 18 times faster than current solutions used in the literature. • An EM algorithm is presented for estimating the unknown RSS model parameters, liberating the system from the need for supervised training and calibration periods. It is demonstrated that the EM algorithm not only improves the accuracy of the introduced system, but also other DFLT systems.

•
The experiments conducted in this paper, together with Matlab code to run the presented filtering, smoothing and EM algorithms are made publicly available and are published in [21]. The aim is to lower the threshold to start research in the area and advance the field of DFLT in general.
The rest of the paper is organized as follows. Related work is discussed in Section 2. Section 3 formulates the problem and presents the localization-and-tracking system. The parameter estimation framework is presented in Section 4. The experiments that were conducted are introduced in Section 5 and the results are presented in Section 6. Thereafter, conclusions are drawn.

Related Work
In this section, related works that use calibration and training are summarized. We begin the section by introducing the most common method in DFLT-using an emptyroom calibration period. Systems that use online training are discussed thereafter. Lastly, works that calibrate the model using supervised or unsupervised training are presented. Empty-room calibration-Most DFLT systems define the RSS changes with respect to µ, often referred to as the reference or baseline RSS. The system performance depends on an accurate estimate of µ and therefore, it is typically calculated over several minutes when the area is not occupied by people [2,18,22]. In this paper, µ is one of the parameters estimated by the EM algorithm and it is shown that the system can be initialized without having an empty-room calibration period.
It is worth noting that methods that do not define the RSS changes with respect to µ have also been proposed [8,23]. Instead, these methods calculate the sample RSS variance over a fixed time window and do not require calibration. A downside is that variance-based methods cannot locate stationary targets because in such scenarios, the variance drops close to the noise floor and does not show up in the estimated image.
Online calibration-Several works have proposed calibrating the reference RSS µ [14,24], or measurement noise variance σ 2 [25,26] online. The system can be deployed without a calibration period when the reference RSS is estimated on-the-fly. Moreover, improved estimators can be developed when better noise models are available [23,25,26]. These approaches first estimate the target state and then the parameters in a sequential order. However, such a decoupling is not always possible, or degrades the system performance if the state estimates are inaccurate. Unlike online calibration methods, this work uses the EM algorithm to estimate µ and σ 2 from batches of data that is collected while the person is inside the monitored area.
Estimating the RSS distribution (defined by µ and σ 2 ) in two states, when a person is crossing or not crossing the imaginary link line between the transceivers, has been explored in [27][28][29][30][31]. Not only can this information be used to localize people, but also to determine if the monitored area is occupied-a non-trivial task in RF sensing. The aforementioned works model the target location as a binary state: either the person is in between the transmitter (TX) and receiver (RX), or they are not. In this paper, we use a continuous measurement model and there is no need to explicitly determine whether a target is crossing the link or not.
Model calibration-The works in [13,15,19] use an offline calibration phase for estimating parameters of the measurement model. During calibration of the parameters, a person moves along a known training trajectory and visits locations of interest. The RSS is recorded between the static wireless nodes and the measurement model parameters are estimated. During the online phase, the calibrated measurement model is used in the tracking algorithm. The works cited above demonstrate high tracking accuracy, but the calibration phase is inconvenient since it can take up to 30 min as in [15]. In this paper, the estimated trajectory and EM algorithm are used for unsupervised learning. The position estimates can be inaccurate in the beginning, but as the person moves in the area, the model parameters can be estimated more accurately resulting in improved tracking performance. The proposed parameter estimation method does not require human intervention other than normal movement in the area.
Parameter estimation-This present work is most closely related to the developments in [22,32], which have an online calibration module as well as a batch-estimation module for tuning the model parameters. The aforementioned works use an imaging solution, and the accuracy is inevitably affected by the binary measurement model and resolution of the pixels. Instead, we use a continuous measurement model, and the implemented Bayesian filter directly estimates the kinematic state of the target using the RSS and higher tracking accuracy is expected [20]. Furthermore, the existing approaches use a nonlinear least-squares solution to estimate the model parameters [22,32], while the proposed method solves the problem using a maximum likelihood approach based on a computationally efficient EM algorithm. Since unsupervised learning depends on accurate position estimates, the introduced system is superior with respect to [22,32]. The experimental results demonstrate that the proposed system can reduce the tracking error by 42% or more and the EM algorithm is computationally up to 18 times faster than the nonlinear optimization method used in [22].
Expectation-maximization has also been used for RSS-based DFLT in [20,33]. However, these approaches use a mini-batch-and particle-filtering-based online EM approach. Although the online EM approach is attractive for on-the-fly estimation of the parameters and rapid adaptation to changing environments, the method also suffers from some important drawbacks. First, since the expectation step is based on particle filtering, which yields a degenerate approximation of the smoothing posterior density required by EM [34,35], it is computationally heavy, and the estimation of the marginal log-likelihood may be poor. Second, in [20,33] the maximization step cannot be done in closed form, and it is implemented by propagating a set of sufficient statistics and the numerical integration is carried out using importance sampling. In contrast, in this paper, the expectation step is calculated using a Gaussian approximation for the smoothing distribution which can be computed efficiently and does not suffer from trajectory degeneration. Furthermore, we show that the maximization step of the EM algorithm is available in closed form for the considered measurement model and implemented Gaussian smoother. Hence, with respect to [20,33], the solution presented in this paper is more tractable in terms of the approximation of the expectation and maximization steps and computational complexity. In addition, the EM algorithm used in [20,33] is only evaluated with simulations, whereas we validate our proposed method using experimental data. The EM algorithm is widely used in different applications and the readers are referred to [36] for an introduction to parameter estimation and to [37,38] for a more general treatment of parameter estimation in nonlinear dynamical system using Gaussian filtering and smoothing.

Localization and Tracking
This work aims to track the kinematic state of a target using the RSS measured between static wireless nodes. The components of the introduced Localization-and-Tracking unit and their relations are visualized in Figure 3 and presented in the following. We begin by presenting the models used for localization and tracking. Thereafter, the estimation tasks are presented, and they are performed by two complementary blocks: (i) the Radio Tomographic Imaging (RTI) unit summarized in Section 3.2, and (ii) the Extended Kalman Filter (EKF) unit presented in Section 3.3. The EKF uses the combination of RSS and RTI position estimates in the measurement update, and the Measurement Selection unit selects and combines the measurements as described in Section 3.4.
The idea to augment the EKF with RTI position estimates was originally presented in [3]. The localization-and-tracking system presented in this paper further improves the filter by introducing a Measurement Selection unit which selects and combines the measurements in a way that enhances the tracking accuracy. In addition, we propose a novel RTI positioning scheme that also estimates covariance of the position estimates. Furthermore, the developed localization-and-tracking system performs no low-pass filtering of the RTI images in contrast with the system presented in [3]. The reason is that image filtering has a negative impact on the parameter estimation algorithm since it introduces a lag in the state estimates and causes correlated position errors.  For DFLT, the state of the system in two-dimensional Euclidean space can be defined as where x k and y k are the xand y-coordinates, and the velocity components are denoted aṡ x k andẏ k . This state representation is particularly suitable for DFLT because the position and velocity define the temporal and spectral properties of the RSS [39]. This state evolves at time k in accordance with where F is the state transition matrix of the dynamic model and q k−1 ∼ N (0, Q) is Gaussian process noise. As a person is not expected to change velocity very rapidly and unexpectedly, a common choice of F in RSS-based DFLT [18,22] is the second-order kinematic model [40], given by where I 2 is an identity matrix, ⊗ the Kronecker product, q the power spectral density of the process noise and τ the sampling period.

Measurement Model
Consider a wireless network, where each of the S nodes can communicate with the other S − 1 nodes. Moreover, the wireless devices can communicate on C different frequency channels. Each transmitter, receiver and channel combination is a unique link and the total number of measured links is L = C · S · (S − 1). It is to be noted that full connectivity is not mandatory for DFLT. It is also to be noted that we do not assume channel reciprocity. The reason being, although the radio channel is reciprocal, measurements of the radio channel are not reciprocal, and parameters of the reciprocal link can be different [41].
The nodes communicate in round-robin fashion and at time k, one node transmits and the other nodes receive. At the next time instant, k + 1, the transmission turn is assigned to the next node in the schedule. The nodes transmit sequentially, and one communication cycle consists of one transmission by every node. At the end of the communication cycle, the nodes switch simultaneously to the next frequency channel in a predefined list. Thereafter, a new communication cycle is initiated. Once each node has transmitted on every frequency channel, the schedule is restarted from the beginning.
For the considered problem, the measurement system at time k can be defined as where y k ∈ R (S−1)×1 is the measured RSS, I k ∈ R (S−1)×L a deterministic link indicator matrix defined by the schedule (see Section 3.3.1), H(x k )θ the linear-in-parameters mea-surement model and r k ∼ N (0, R) is Gaussian measurement noise. The human-induced RSS changes are modeled using an exponential model [20] and the complete linear-inparameters model is defined as where λ is the decay rate, µ l the reference RSS and φ l the measurement gain, H(x k ) ∈ R L×2L and θ ∈ R 2L×1 . In (5), the excess path length ∆ l,k defines an ellipse with the foci at the TX and RX and it relates the person's location p k = x k y k T to link l with TX m and RX n by where p m and p n denote the TX and RX positions in respective order. Lastly, the measurement noise covariance is assumed diagonal and it is defined as R = diag σ 2 1 , σ 2 2 , . . . , σ 2 L . It is to be noted that the RSS can be measured at most for S − 1 links simultaneously because only one node transmits at a time. Moreover, to measure the L links takes S · C transmissions and S · C · τ duration of time.

Image Estimation
RTI estimates a discretized RSS change field, denoted by b c , using the RSS of J = S(S − 1) links measured on frequency channel c. As in [42], the RSS is assumed to be a linear combination of voxel changes plus noise where z c ∈ R J×1 is the mean-removed RSS, W c ∈ R J×N a weight matrix that relates the spatial change field b c ∈ R N×1 to the RSS, N the voxel number and r c ∈ R J×1 the measurement noise. The measurement vector and noise covariance in (4) can be decomposed as where y c and R c denote the RSS and measurement noise covariance on channel c. Now, the RTI measurement and noise vectors are related to the model in (4) via z c = y c − µ c and r c ∼ N (0, R c ). The minimum mean square error estimate for the model in (7), with zero-mean The covariance matrix Σ b for pixels m and n is [42] where σ 2 b is the variance of each pixel and δ d is a user-defined space constant. For link l and pixel n, we define the elements of W c as where φ l and λ are the measurement gain and decay rate of the model defined in (5), φ l is a normalization term and ∆ l,n the excess path length. In the literature, W has taken many forms, and the reader is referred to [16,43] for further details. The projection matrix Π c is channel dependent and it is computed independently for each of the channels. However, Π c must be computed only once at the beginning of the experiment and the real-time computation of the image requires only one matrix multiplication, of O[NL] multiplications and additions. The spatial change field is estimated at the end of each communication cycle when k mod S = 0 and z c contains measurements from time instant k − S + 1 to k. The image estimate on channel c is denoted from now on asb k , to prevent using two time notations.

RTI Positioning
For a single target, localizing the person can be postulated as finding the mode ofb k since it is expected that the pixels with highest intensity locate near the target [44]. The mode is in the set of pixels with intensity higher than γB, where B = max(b k ) denotes the maximum component ofb k and 0 ≤ γ ≤ 1 is a threshold. The threshold is a tuning parameter between two extremes: if γ = 0 all pixels are taken into account and if γ = 1 only a single pixel is accounted for, and we have empirically found that γ = 0.7 provides a good overall performance. Let us definẽ where w =b k / ∑b k . Now, the position estimate and sample covariance are given by: where p n = x n y n T are the pixel coordinates and w n the weight for pixel n.
An example is illustrated in Figure 4 in which two RTI images are shown together with the estimates given in Equations (12) and (13). The image on the left is an ideal RTI image, pixels withb k ≥ γB are centered around the target location and the image has very little noise resulting in an accurate position estimate and small covariance. The image on the right is very noisy and the image is multimodal. As a result, the estimated position does not accurately indicate the target location and the estimated covariance is significantly higher than in the other image. However, estimating the covariance allows taking such uncertainties into account and the tracking filter developed in the next section gives less weight to position estimates that are estimated from noisy images, such as the one on the right of Figure 4.

Tracking Filter
The extended Kalman filter (EKF) computes the marginal posterior distribution of x k for each time step k using the data y 1 , . . . , y k and assuming Gaussian approximations for the filtering densities so that p(x k | y 1:k ) ≈ N (x k | m k , P k ). Different than conventional Bayesian filtering implementations for DFLT [13,18,20], in this work, the measurement model of the filter is augmented with the position estimates from RTI as in [3]. This bounds the filter's measurement residuals by the position errors of the imaging approach. Therefore, the developed filter has the robustness of an imaging method and the tracking accuracy of a Bayesian filter. The filtering algorithm consist of three steps: (i) prediction step, (ii) model selection, and (iii) measurement update step. We simply refer to the introduced filter as EKF, although it is more complex than a first-order filter that would solely use RSS. In the following, we first present the observation model of the EKF and thereafter, the prediction and update steps of the filter.   (12) and (13). In the image, the deep blue regions indicate areas that are not occupied by people and the bright regions indicate estimated obstructions. Furthermore, the plus sign indicates the true position, the crosses are the position estimates, and the dashed line illustrates the 3σ uncertainty ellipse [3], reprinted with permission from ref. [3]. Copyright 2019 IEEE.

EKF Observation Model
Recall that at a given time instant k, at most S − 1 links are measured. Instead of using the complete model defined in Equation (5), the EKF operates on a subset of the measurement model. We refer to the subset as the observation model and essentially, it contains the measurements and associated models sampled at time k. Thus, the observation model is defined by the TX and channel identifiers, and it changes with time. To explicitly define the observation model, consider the set of nodes S = {1, 2, . . . , S} and the set of channels C = {1, 2, . . . , C}. Then, the link index l corresponding to the transmission by node i ∈ S on frequency channel c ∈ C and received by node j ∈ S is The RXs that measure the RSS at time k are R = S \ {i} and |R| = S − 1. Now if m = [1, . . . , |R|], we can define the indices of the link selection matrix, measurement model, and noise covariance as follows: In addition, the EKF requires the Jacobian of H(x k )θ, and the elements of this matrix are given by where p i and p j denote the positions of nodes i and j. The Jacobian for m is

Prediction Step
Given that the dynamic model in (2) is linear, the prediction step of the first-order additive noise EKF can be expressed as [36] where m k and P k denote the state estimate and covariance in respective order, and m − k and P − k are the predicted mean and covariance.

Measurement Update
The measurement selection unit presented in Section 3.4 calculates the measurement residual ν k and forms the associated measurement noise covariance matrix R and measurement model matrix H. Using these, the mean m − k and covariance P − k can be updated using [36] S

Measurement Selection
The DFLT system implementations using Bayesian filtering or imaging (in particular RTI) have different characteristics. Depending on the target's position and system deployment, the performance of the introduced filter can be improved by enabling or disabling certain measurements. For example, the covariance of the RTI position estimate can be small and biased. On the other hand, the filter can converge to an incorrect trajectory and the estimated covariance is not able to account for the uncertainties in the state estimate. To solve these issues, a logic to select the effective measurements is introduced. The procedure is based on the normalized innovation squared (a.k.a. square of the Mahalanobis distance) [40] where is the linear measurement model. The test statistic has a χ 2 distribution with two degrees of freedom and it can be used to assess whether the realized RTI estimate is unexpectedly large with respect to the prior predictive distribution. In addition, the square of the Mahalanobis distance between two successive RTI estimates is calculated wherep k−S and C k−S denote the previous RTI position estimate and covariance. The test statistic can be used to assess whether the prior predictive distribution has converged to an incorrect trajectory. For simplicity, the index notation is dropped and the measurement model, measurement noise covariance matrix, and Jacobian are simply denoted as: H(x k )θ ∈ R |R|×1 , R ∈ R |R|×|R| and H x ∈ R |R|×4 , respectively. The resulting logic to select the measurement models is presented below in which T denotes the confidence interval of the χ 2 distribution with two degrees of freedom.
• if 1 > T and 2 ≤ T -It is likely that the filter has diverged. Use only the output of RTI, i.e., R = C k , H = H and ν k =p k − Hm − k .
• else if 1 ≤ T -Normal operation, concatenate the models: R = blkdiag(C k , R), • else-The RTI position estimate is likely inaccurate, use only the RSS measurements, i.e., R = R, H = H x and ν k = y k − H(m − k )θ. The measurement residual ν k , measurement noise covariance matrix R and measurement model matrix H are used by the EFK update step presented in Section 3.3.3.

Parameter Estimation
In this section, an EM algorithm based on Gaussian smoothing is developed. The section begins by introducing the Gaussian smoothing recursion for the considered problem. Thereafter, we show how the developed Gaussian smoother can be used to evaluate the expectations involved in the E-step of the EM algorithm. We also derive the solution of the maximization problem in the M-step in closed form for the considered measurement model.

Gaussian Smoothing
In this section, we present the smoothing recursions for the Rauch-Tung-Striebel smoother (RTSS). The RTSS computes the marginal posterior distribution of the state by conditioning on the whole measurement data. The smoothing solution is given by where K ≥ k, which can be calculated recursively. The smoothing recursion starts from the last time step K and proceeds backwards to the first time step, and the recursion is given by [36] In the EM algorithm, the expectation is over the smoothing distribution (see Equation (24)) and the obtained smoother result is used by the EM algorithm.

Expectation-Maximization-Based Parameter Estimation
Likelihood-based parameter estimation approaches seek to estimate the unknown model parameters Θ from the marginal likelihood p(y 1:K | Θ). In general, these methods maximize the logarithm of p(y 1:K | Θ) to find the maximum likelihood (ML) estimate of Θ, given by The likelihood is over the joint density of the measurements and the latent state variables.
Since computation of the high-dimensional integral required in marginalizing the states out is practically impossible, in this paper, we use the EM for approximating ML estimation.
The key idea behind the EM algorithm is that the marginal likelihood can be maximized by iteratively maximizing its lower bound, which is equivalent to maximizing [36] Q(Θ, Θ (i) ) = E{log p(x 0:K , y 1: where the expectation is with respect to p(x 0:K | y 1:K , Θ (i) ) and Θ (i) denotes the parameter estimate at iteration i. The expectation step of the EM algorithm is equivalent to computing (24) over the smoothing distribution, whereas the maximization step aims at maximizing Q(Θ, Θ (i) ) with respect to Θ.
Using the properties of the state-space model and noting that the parameters only enter the measurement likelihood, maximizing Q is equivalent to maximizing (see, e.g., [36,38]) Furthermore, for the DFLT problem considered in this paper and under the assumption that the measurement noises of the individual links are mutually independent, zeromean Gaussian noise, r l,k ∼ N (0, σ 2 l ) and Cov{r l , r j } = 0, ∀ l = j, the parameters can be estimated independently for each of the L links as follows. First, recall that the measurement model of the lth link is given by Then, it is shown in Appendix A that the M-step maximizing the approximation of Q(Θ, Θ (i) ) in (25) has a closed form solution and the parameter update is given bŷ where K is the number of measurements of link l, and B, D and G are calculated using the latest smoother results as follows E{h l (x k ) | y 1:K } T y l,k , The expectations used to calculate B and G involve nonlinear transformations of x k which can be approximated using Taylor series expansion. This yields E{h l (x k ) | y 1:K }≈h l (m s k ), where m s k and P s k are the mean and covariance of the smoother result and H x is the Jacobian of h l evaluated at m s k (see Equation (16)). For a more detailed treatment of EM-based parameter estimation please refer to, for example, [36][37][38].
Due to quantization of the RSS, the estimated variance may be zero even though the true real-valued received power would have had a positive variance. We apply shrinkage, which imposes an L2-penalty on the estimated covariance matrix, to assure positive variance and avoid numerical instability. In practice, the L2-penalized ML estimate is given by the simple convex transformation: whereR = diag σ 2 1 ,σ 2 2 , . . . ,σ 2 L , α is the shrinkage coefficient, TrR denotes the trace of the matrix and I L is an identity matrix.

Experiments
The development efforts of this paper are demonstrated using Texas Instruments CC2531 USB dongle nodes [45]. The nodes operate on the 2.4 GHz ISM band and communicate on a set of frequency channels C ∈ {11, . . . , 26} defined by the IEEE 802.15.4 standard [46]. The wireless nodes follow a round-robin schedule as discussed in Section 3.1.2. In the transmitted packets, the nodes include the most recent RSS measurements, associated with the transmissions of other nodes. The time interval between the communications is approximately, τ ≈ 2.9 ms, defining the sampling period for the system. A base station that overhears all the traffic extracts the RSS from the packets and relays the measurements to a computer through UART for centralized processing. The readers are referred to [47] for a detailed description of the communication protocol. It is to be noted that the method of this paper can be generalized to any device capable of measuring the RSS including Wi-Fi, Bluetooth and RFID.
The experiments are conducted in an open indoor environment and in a downtown residential apartment. In both experiments, 20 nodes are deployed as illustrated in Figure 5. In the open environment, the nodes are set on top of podiums (≈0.9 m) and deployed around a 75 m 2 area. The size of the apartment is 82 m 2 and the nodes are deployed by the electric sockets so they could be powered from the mains. The walk-in closets did not have electric sockets on the exterior walls, so we decided to deploy one battery-powered node in each to ensure coverage of the entire apartment. These two nodes are located at Before the experiment, reference positions were defined and marked. During the experiment, the person's trajectory follows the imaginary lines between the markers. Once the target reaches a reference position, they stop, remain stationary for a few seconds, and then walk to the next reference position. During the experiment, the person is carrying a video camera. In post-processing, the RSS and video streams are synchronized, and the video is used to define the ground truth trajectory. In Section 6.4, the statistical significance of the tracking error is tested to assure that the generated trajectory is close to the ground truth.
The experiments in both environments are conducted with one, four and 16 frequency channels and the set of used channels are: C = {26}, C = {11, 16, 21, 26} and C = {11, . . . , 26}. In addition, three different trials are conducted with each channel number. The trials are approximately three minutes long and every reference position is visited at least once in each trial. In the following section, the experiments are referred to as Exi.j., where i indicates the experiment number and j the trial. Experiments 1-3 are conducted in the open environment and experiments 4-6 in the apartment. Furthermore, Ex1 and Ex4 use one channel, Ex2 and Ex5 four channels, and Ex3 and Ex6 all 16 frequency channels. In the apartment experiment, there are several co-existing Wi-Fi networks located in the coverage area, but the presented system can remain operational. The system is not particularly sensitive to occasional packet drops and frequency channel diversity partly mitigates interference issues. As an example, the packet reception rate is below 85% on the most congested channel and above 99% on channels that do not share the frequency band with Wi-Fi.
The imaging parameters used in the experiments are given in Table 1, whereas the parameters of the tracking algorithm are defined by the measurement model Θ = [µ, φ, λ, σ 2 ].
In the experiments, λ = 0.04 m is assumed to be constant unless otherwise stated, the measurement gain and variance are initialized using φ 0 = −5 dB and σ 2 0 = 1 dB 2 . In Section 6, the initialization of µ 0 is discussed. The only user-defined parameter in the EKF is the process noise value and it should be tuned to the actual motion. In this paper, q = 0.01 m 2 /s 3 which corresponds to an acceleration of a ≈ 1.8 m/s 2 for the considered system. The tracking filter is initialized when the person has reached the first reference position and is stationary. The filter is initialized using m 0 = x 0 (m) 0 (m/s) y 0 (m) 0 (m/s) T , where x 0 and y 0 are the center coordinates of the monitored area and P 0 = I 4 . To note, the ground truth position is never at x 0 y 0 T when the filter is initialized. Occupancy assessment is an important problem in DFLT [32] but for simplicity, we assume we know the time instances when the person enters and exits the monitored area.

Parameter Unit
Pixel Variance (9) σ 2 b 0.0005 (dB 2 ) Correlation distance (9) δ d 0.5 (m) Spatial decay rate (10) λ 0.04 (m) Pixel width (7) δ p 0.25 (m) Image threshold (12) γ 0.70 The filters are evaluated using the root-mean-square error (RMSE) which is defined as RMSE = √ MSE. The mean-squared error (MSE) is where K ≈ 62,000 is the total number of estimates in one trial, p k denotes the ground truth position, the hat accent indicates the estimate and · 2 2 the square of the Euclidean norm.

Experimental Results
Matlab code to run the tracking and localization, smoothing and parameter estimation algorithms presented in Sections 3 and 4, and the experimental data presented in Section 5 are available in [21]. The reader is referred to the associated readme file for an overall algorithmic description of the derivations presented in this paper and to the Matlab files for the actual implementation of the algorithms. The development efforts of this paper are experimentally validated in the following and benchmarked against existing solutions from literature. For now, µ 0 is calculated using a two-minute empty-room calibration period. From Section 6.3 onward, µ 0 is initialized without an empty-room calibration period.

EM with Existing DFLT Methods
In this section, it is shown that the EM algorithm can be used to enhance not only the performance of the proposed system, but also two de facto DFLT methods from literature. The first is RTI. The target is positioned as presented in Section 3.2, a standard Kalman filter (KF) is used for tracking and the RTSS is used for smoothing. The second method is a particle filter (PF). The implemented PF is a sequential importance resampling (SIR) filter with 1000 particles. The state estimate and covariance are calculated from the filtering distribution which is approximated by the set of particles and associated weights. A particle smoother could be implemented for approximating the smoothing distributions but for simplicity, the RTSS presented in Section 4.1 is used instead. A re-initialization procedure is required by the PF, because it is prone to diverge when the measurement model is inaccurate [3]. The PF is re-initialized, if the position error is larger than two meters, by drawing new particles from a uniform distribution within the monitored area and with zero velocity.
In Figure 6, RMSE of the different filters as a function of EM iteration number. As shown, the tracking performance is satisfactory with the initial parameter estimates (EM iteration 0) and all filters have an RMSE above one meter. It is to be noted that most DFLT systems are implemented similarly, i.e., an empty-room calibration period is used to estimate µ 0 and an educated guess is used for the other parameters. The empty-room data cannot be used to estimate the other parameters since they depend on the location of the person. However, they can be estimated using the state estimates after the person has entered the monitored area and moves around. In this paper, the EM algorithm based on Gaussian smoothing is used and with better parameter estimates, the tracking accuracy can be improved as we demonstrate in the following.
After the filtering recursion, the RTSS recursion starts from the last time step and proceeds backwards to the first time step. Thereafter, the E-step of the EM algorithm can be approximated using the smoothing distribution and the parameter estimates are obtained from the M-step in closed form. Using the new parameter estimates, the filtering recursion is started from the beginning. This iterative process improves the model parameter estimates and results in enhanced tracking accuracy. As shown in Figure 6, the RMSE decreases by 46-67%, depending on the filter. The results demonstrate that the implemented smoother and EM algorithm can also be used with other DFLT methods, and it is an effective method to improve system performance.

Parameter Estimation Algorithms
In this section, the EM algorithm is compared to the nonlinear least-squares (NLS) approach proposed in [22]. The parameter estimates are obtained by minimizing the cost function where h l (m s k , Θ) = µ l + φ l exp(∆ l,k /λ l ) is the nonlinear exponential model and λ l is now a parameter to be estimated. In this paper, a nonlinear least-squares solver based on the interior-reflective Newton method described in [48] is used to find the minimum of J(Θ) and thereafter, the ML estimate of σ 2 l is computed. The NLS approach provides freedom in the set of parameters that are estimated. In the following, we evaluate the NLS approach that estimates the following parameters Θ {j} : (i) Θ {1} = [µ, φ, σ 2 ], as proposed in this paper; (ii) Θ {2} = [µ, φ, λ], as proposed in [22]; and (iii) Θ {3} = [µ, φ, λ, σ 2 ], a system that estimates all measurement model parameters. The results are compared to the EM algorithm that estimates Θ {1} . We denote the parameter estimation algorithm and set of parameters simply as NLS(Θ {j} ). The RMSEs are illustrated in Figure 7 and the results imply that estimating Θ {1} yields the highest tracking accuracy whereas estimating Θ {2} the lowest. To examine this difference more closely, we concentrate on Ex3 and the NLS and calculate the average R 2 statistic, defined as andȳ l = 1 K ∑ K k=1 y l,k . The R 2 statistic measures how much of the observed variation in the mean can be explained by the model. For the two cases, R 2 = 17.0% and R 2 = 19.1% for NLS(Θ {1} ) and NLS(Θ {2} ) in corresponding order, meaning that estimating Θ {2} explains the mean of the data more accurately, but the difference is only 2.1%. Calculating the Kullback-Leibler Divergence (KLD) yields 0.04 and 0.52 for Θ {1} and Θ {2} in respective order. As the KLD indicates, Θ {2} is unable to account for the noise in the data and improved estimators can be developed when better noise models are available. Thus, estimating σ 2 rather than λ has a significantly higher impact on tracking accuracy. It is to be noted that EM(Θ {1} ) and NLS(Θ {1} ) yield comparative performance and small differences are expected, for example, due to the termination rule of the optimization method. Interestingly, NLS(Θ {3} ) has a higher RMSE than NLS(Θ {1} ). This is either caused by over fitting the model or then the optimization algorithm converges to a local minimum. In the measurement model, φ and λ are coupled and the optimization algorithm must solve for these simultaneously which can be problematic. The main benefit of the proposed EM algorithm is that it can be solved in closed form using simple arithmetic operations, whereas the NLS approach requires a solver for the nonlinear optimization problem. In practice, the estimates only require computing two vector products (B and D) with complexity O(K 2 ), and calculating G with complexity O(K 2 + n 3 K), where n is the state dimension and K the number of measurements. As an example, for three minutes of experimental data and using the initial parameter estimates, the computation time of the parameter estimation algorithms in experiment Ex3 are: in respective order. The results are obtained using a Matlab implementation and a computer equipped with a 2.60 GHz Intel Core i7-8850H processor and 32 GB of RAM. As demonstrated by the results, the EM algorithm is computationally very efficient, up to 18 times faster than NLS. It is to be noted that the computation time of NLS has a significant dependence on the parameter values that are used to initialize the optimization algorithm. For example, the computation time of NLS(Θ {3} ) is 13.80 s during the last parameter estimation iteration. Additionally, the link number has an impact since it defines the number of times NLS is called. As an example, the computation time in Ex1.1. and NLS(Θ {3} ) is 3.12 s, which is significantly shorter than in Ex3.1. because the experiment only uses one frequency channel.

System Comparison
In this section, the proposed system is benchmarked against an adaptive radio tomographic imaging (ARTI) system [22]. ARTI is an imaging method that estimates µ and φ online, smoothing is used to enhance the image and state estimates, and NLS is used for estimating φ and λ. In the experiments, both systems are initialized without any prior information of the RSS, model parameters or location of the person. ARTI has an online calibration unit to estimate the reference RSS and the system is functional from the very beginning. The proposed system does not have such a feature and we use the online calibration unit of ARTI to estimate µ during the first filtering recursion and then the unit is disabled during the subsequent iterations. It is to be noted that µ could be initialized in various ways, but for matter of fairness we use the same method as ARTI uses.
The results are summarized in Table 2 and for each experiment, the results are averaged over the three trials. As shown, the proposed algorithm results in superior performance, an average decrease of 42% in the RMSE with respect to ARTI. For ARTI the estimates are most of the time accurate, but in certain positions, the location estimate is widely off (the skewness is 5 and kurtosis is 37 indicating that the distribution has a positive skew, and it is heavy tailed). The main reason for the large position errors is that a link can measure a really large RSS change when the person is not on the link line, the straight imaginary line between the TX and RX. When using imaging methods, this one link will dominate over the other links and the person will be localized in between the wrong TX-RX pair.
The proposed system is not as vulnerable to such outliers (skewness is 2 and kurtosis is 10) because of the implemented measurement selection logic which discards RTI position estimates with abnormally large errors. The performance difference between ARTI and EKF can also be explained with the set of parameters that are estimated by the systems. ARTI estimates Θ {2} = [µ, φ, λ] and this limits the achievable accuracy of the system as discussed in the previous section. To support this claim, in Figure 6 it is shown that an RTI solution together with EM almost achieves the same accuracy as the proposed solution.

System Performance over Time
Next, we demonstrate that the proposed system can maintain its high accuracy over time. The conducted trials are actually snippets from a longer experiment. The entire experiment contains the three trials explained before and a five-minute period when the person randomly walks inside the monitored area. In between the occupancy time periods, the person leaves the area for two minutes at a time. The five-minute period takes place before the trials, and we will run ten iterations of the EM algorithm using data from this period. Then, the obtained parameter estimates are used the next time the person enters the area. After each trial, the EM algorithm is used once to recalculate the model parameter estimates. The results are summarized in Table 3 and in every experiment, the tracking accuracy remains high throughout the different trials and there is no indication that the RMSE increases. This implies that the proposed system is suitable for estimating the model parameters without requiring human intervention and for maintaining high tracking accuracy over time. The ground truth trajectory together with the coordinate estimates is illustrated in Figure 8 for Ex6.3. Please note that the covariance of RTI position estimates changes from frame to frame and, therefore, the pink area which illustrates the 3σ confidence interval is not constant.  The described procedure is one of the possibilities how the proposed system would be used in practice, i.e., the parameters would be estimated at regular intervals or once the person has covered enough distance. However, there is a downside to the EM algorithm. It does not account for prior information, and it computes the ML estimates of the parameters from the data that is used as input. As an example, the data from the five-minute period is forgotten when the parameters are re-estimated after the first trial. This is an issue that must be solved for systems that are deployed over an extended period of time. One alternative is to compute the maximum a posteriori (MAP) estimates which can be done in practice by maximizing Q(Θ, Θ (i) ) + log p(Θ) at the M-step instead of the plain Q(Θ, Θ (i) ) [36]. The prior information is included in the MAP estimate via the additional term log p(Θ).
The ground truth trajectory is reconstructed from the video recording. When the person is stationary, the ground truth locations are accurate because the reference positions were measured precisely with a laser rangefinder, and it is easy to extract these time instances from the video. When the person is moving, the ground truth can contain small errors because the video and measurements cannot be perfectly synchronized. Furthermore, the person does not move exactly with constant velocity. Let us assume that the ground truth trajectory is reconstructed accurately and let the null hypothesis be that the RMSE is the same when the person is stationary and moving. Then, we can test the statistical significance of the result in determine whether the null hypothesis should be rejected or retained. The RMSE is 32.8 cm when the person is standing still and 33.0 cm when moving. The t-test statistic of the independent two-sample t-test equals 0.0465 and the critical value is 2.03 with a 5% significance level. Since the statistic is lower than the critical value, the null hypothesis remains valid and the RMSE for stationary and moving periods can be considered the same. The result indicates that the ground truth trajectory has been accurately reconstructed.

Simulations
Lastly, we want to validate the development efforts of this paper numerically using simulations. Thus, the performance of the proposed system and ARTI are numerically analyzed using a simulation scenario which replicates Experiment 3. In total, 100 Monte Carlo simulations are performed and for each run, the model parameters are randomly drawn. The model parameters used in the simulation are drawn from: a Gaussian distribution, µ ∼ N (−62.80, 7.50), non-standardized Student's t-distribution φ ∼ T (−2.14, 3.60, 4.59), a uniform distribution λ ∼ U (0.01, 0.13) and a log-normal distribution σ 2 ∼ L(0.79, 0.88). The exact distributions of the model parameters are unknown, but the used ones provide a functional fit and they resemble the empirical distributions obtained using data from the open environment experiments. In the following, the RMSE is evaluated with respect to the posterior Cramér-Rao bound (PCRB) of RSS-based DFLT [3]. In addition, the RMSE of the parameter estimates are examined.
In Figure 9, RMSE of the two systems as a function of parameter estimation iteration number is illustrated. With the initial parameter estimates, see iteration number zero in Figure 9, ARTI achieves a lower RMSE because µ and φ are estimated online during the filtering recursion. However, the proposed system outperforms ARTI after the parameters have been estimated by the EM and NLS algorithms. As illustrated in the figure, the EKF converges much closer to the PCRB than ARTI. More quantitatively, at iteration number five, the PCRB is 3.7 cm whereas the RMSE of the EKF and ARTI are 7.1 cm and 16.7 cm in respective order, a 57% decrease in tracking error in favor of the EKF. The EKF achieves higher tracking accuracy due to two reasons. First, the EKF-based tracking algorithm is more accurate than the KF-based tracking algorithm of ARTI. With improved tracking performance the parameter estimates are more accurate which improves the tracking performance even further. As shown in Table 4, the RMSE of the parameter estimates for the EKF are significantly lower for µ, φ and σ 2 . The second reason is that an accurate estimate of σ 2 rather than λ has a significantly higher impact on tracking accuracy as discussed in Section 6.2. As tabulated in Table 4, the RMSE of λ for ARTI and the EKF are 0.0365 m and 0.0433 m in respective order, only a 19% increase in RMSE when it is not estimated by the proposed system. Respectively, the RMSE of σ 2 decreases by 85% when estimated by the EM algorithm and as a result, improved estimators can be developed when better noise models are available.

Conclusions
The work in this paper addresses three fundamental challenges in DFLT: first, having an accurate model of the RSS as a function of target and transceiver positions; second, estimating the parameters of the model without requiring calibration or supervised training; and third, maintaining that model over time without requiring recalibration of the system. These problems are tackled by developing a system for estimating the RSS model parameters and the target's arbitrary trajectory. In the paper, the model parameters are estimated using an EM algorithm based on Gaussian smoothing and a novel localizationand-tracking system is presented to fully use the EM algorithm's potential. The system is validated using 18 different measurement data sets from two different environments. The results suggest that high tracking accuracy can be achieved without using calibration data. With respect to another adaptive DFLT system, it is demonstrated that the proposed system reduces the RMSE by 42%, while the parameter estimation algorithm is up to 18 times faster to compute. The developments of this paper streamline the deployment and maintenance needs of DFLT systems without sacrificing accuracy.

Appendix A. EM Algorithm
The parameter updates (27) for the linear-in-parameters measurement model (26) are derived in the following. In state-space models, Q(Θ, Θ (i) ) can be decomposed to [36][37][38] Q(Θ, Θ (i) ) = K ∑ k=1 E{log p(y k | x k , Θ) | y 1:K } by making use of the Markov property of the state sequence and conditional independence of the measurements. Noting that in DFLT, the dynamic model p(x k | x k−1 ) and initial distribution p(x 0 ) are independent of the unknown parameters, maximizing (A1) is equivalent to maximizing the first term only, i.e., maximizing (25). To simplify the notation, the dependence of h l,k on x k is not explicitly stated in the following. The log-likelihood of the first term is To estimate the parameters, (25) is approximated using the smoothing distribution provided by the Gaussian smoother in Section 4.1 and given by p(x k | y 1:K , Θ (i) ) ≈ N (x k ; m s k , P s k ).
The maximization step is computed by setting the gradient with respect to θ l to zero, i.e., (A3) Using the notation in (28) and solving (A3) for θ l yieldŝ Similarly, for σ 2 l we obtain E h T l,k h l,k | y 1:K θ l = 0. (A5) Using the notation in (28), substituting θ l to (A5) and solving for σ 2 l yieldŝ Finally, note that the optimalθ l does not depend onσ 2 l so that they can be sequentially optimized by first solving forθ l and then substituting the value into the estimate ofσ 2 l .

Data Availability Statement:
The experiments conducted in this paper, together with Matlab code to run the presented filtering, smoothing and EM algorithms are made publicly available and are published in [21].