Next Article in Journal
Suspended-Core Microstructured Polymer Optical Fibers and Potential Applications in Sensing
Previous Article in Journal
Spatio-Temporal Mapping of L-Band Microwave Emission on a Heterogeneous Area with ELBARA III Passive Radiometer
 
 
Font Type:
Arial Georgia Verdana
Font Size:
Aa Aa Aa
Line Spacing:
Column Width:
Background:
Article

People Counting by Dense WiFi MIMO Networks: Channel Features and Machine Learning Algorithms †

1
National Research Council of Italy (CNR), Institute of Electronics, Computer and Telecommunication Engineering (IEIIT), Piazza Leonardo da Vinci 32, 20133 Milano, Italy
2
Politecnico di Milano, Department of Management, Economics and Industrial Engineering (DIG), Piazza Leonardo da Vinci 32, 20133 Milano, Italy
*
Author to whom correspondence should be addressed.
This paper is an extended version of our paper published in Kianoush, S.; Savazzi, S.; Nicoli, M. Device-free Crowd Sensing in Dense WiFi MIMO Networks: Channel Features and Machine Learning Tools. In Proceedings of the 2018 15th Workshop on Positioning, Navigation and Communications (WPNC’18), Bremen, Germany, 25–26 October 2018; pp. 1–6.
Sensors 2019, 19(16), 3450; https://doi.org/10.3390/s19163450
Submission received: 27 May 2019 / Revised: 24 July 2019 / Accepted: 31 July 2019 / Published: 7 August 2019
(This article belongs to the Section Sensor Networks)

Abstract

:
Subject counting systems are extensively used in ambient intelligence applications, such as smart home, smart building and smart retail scenarios. In this paper, we investigate the problem of transforming an unmodified WiFi radio infrastructure into a flexible sensing system for passive subject counting. We first introduce the multi-dimensional channel features that capture the subject presence. Then, we compare Bayesian and neural network based machine learning tools specialized for subject discrimination and counting. Ensemble classification is used to leverage space-frequency diversity and combine learning tools trained with different channel features. A combination of multiple models is shown to improve the counting accuracy. System design is based on a dense network of WiFi devices equipped with multiple antennas. Experimental validation is conducted in an indoor space featuring up to five moving people. Real-time computing and practical solutions for cloud migration are also considered. The proposed approach for passive counting gives detection results with 99% average accuracy.

Graphical Abstract

1. Introduction

Internet of Things (IoT) and new ubiquitous connectivity paradigms beyond 5G have created unprecedented dynamics for opportunistic sensing by exploiting low-cost radio devices [1,2]. Such new sensing modalities leverage the cross-fertilization of computing and communication technologies and are typically achieved through the transformation of natural, stray or ambient radio frequency (RF) radiation into new sensing ways [3] to probe the environment and the people moving inside it. In particular, the use of devices with built-in radio modems (e.g., WiFi or cellular-enabled devices) and Multiple–Input Multiple–Output (MIMO) capabilities, as envisioned by 5G for device-free (or passive) body motion sensing [1], is becoming attractive in many fields, such as smart homes and assisted living applications. Device-free radio sensing provides privacy-preserving monitoring with increased robustness to environmental conditions with respect to video-based systems [4,5], such as in case of fire or variable light conditions.
In this paper, we address the specific problem of passive people counting in an indoor space covered by a network of MIMO WiFi devices. The proposed counting system tracks and classifies the perturbations of the electromagnetic field maintained by a WiFi network to detect and discriminate multiple subjects (namely the targets) that cause such perturbations. As depicted in the scenario of Figure 1, target detection and counting is based on real-time processing of the radio channel that is measured by a dense network of MIMO WiFi devices configured as transmitters (TXs) and receivers (RXs) according to the specific wireless protocol.

Contributions

Capturing crowd dynamics (i.e., subject counting and tracking) by signal analytics in IoT networks is an emerging topic in research as well as in practical implementation of crowd management systems for smart city applications [6]. By using video footage, computer vision approaches allow accurate crowd monitoring. However, deploying a camera network is costly and inhibits scalability for events that happen infrequently. In addition, video systems have often different specific installation requirements and privacy constraints, as well. On the other hand, supported by a large installation basis, currently available IoT devices offer an excellent instrumentation ground for crowd monitoring, thanks to the multitude of available sensors and radio interfaces (e.g., Bluetooth, WiFi, Thread and ZigBee). Passive crowd monitoring by wireless networks is a new emerging topic of research but some first experimental works can be found in the literature. In reference [7], the authors proposed an algorithm to localize and count multiple targets. In order to address the non-linearity of the impact of multiple subjects on the radio signals, they proposed a successive cancellation algorithm to iteratively determine the number of subjects by modelling the indoor human trajectories as a state transition process, using an indoor human mobility model and integrating all information into a Conditional Random Field (CRF) to simultaneously localize and count the subjects. In reference [8], the authors proposed a non-image people counting system based on a Deep Neural Network (DNN) model using fine-grained physical-layer wireless signatures such as WiFi CSI (Channel State Information) data. Real test-bed experiments showed that the proposed system can achieve an average correct classification rate up to 88 % when estimating the crowd size in indoor scenarios with up to nine people. In reference [9], the authors suggested a people counting algorithm using Impulse Radio Ultra-WideBand (IR-UWB) radar sensors, equipped with antennas which have narrow beam width. The system performances have been validated in a representative environment, showing results with large accuracy. However, the use of dedicated UWB radar sensors is still an expensive choice for applications in massive IoT smart spaces, where thousands of IoT devices are deployed.
This paper proposes the transformation of a dense MIMO WiFi network into a passive crowd sensing system by exploiting Machine Learning (ML) tools that process multi-dimensional CSI extracted from different PHY (PHYsical layer) frames, MIMO antennas and sub-carriers. Dense MIMO networks pose remarkable scalability challenges in signal processing due to the increase of the dimensions of the CSI data that are usually represented as tensors or multi-dimensional arrays. Therefore, unlike previous approaches, we propose here ML tools that are designed to fully exploit the statistical coherence of the CSI data over this multi-dimensional space domain (i.e., antenna, frequency, time) in order to extract compact but maximally informative CSI features. In other words, CSI data tensors are pre-processed to produce custom-designed features that are organized in accordance to the meaning of the underlying data dimensionality.
Based on such premise, ensemble classification and ML approaches based on different multi-dimensional features are compared, such as Bayesian techniques, recurrent Long-Short Term Memory (LSTM) neural network and Feed-Forward Neural Network (FF-NN) models [10]. Finally, considering the problem of scalability to large monitored areas (e.g., system deployed in buildings or multiple rooms), we consider the problem of transferring part of the data analytics to an edge-cloud system to support real-time monitoring [11].
The paper is organized as follows: Section 2 discusses the selection of multi-dimensional radio channel features according to the space-frequency MIMO configurations. Section 3 presents the computing architecture including ML tools and the edge-cloud computation system, while Section 4 describes the experimental validation setup and presents some numerical results. Finally, some conclusions are drawn in Section 5.

2. CSI Features for Subject Counting

We consider the problem of extracting information about the number C 0 , 1 , , N N of human targets moving in a monitored area X by processing CSI data measured over a broadband multi-carrier MIMO network deployed in the area. As illustrated in Figure 1, the network is composed of N t transmitting and N r receiving devices, each equipped with M antennas. Modulation is based on Orthogonal Frequency-Division Multiplexing (OFDM). The overall network MIMO system is thus characterized by M t = N t M transmitting and M r = N r M receiving antennas. Any target moving inside the area modifies the radio propagation between the devices and thus is expected to affect the MIMO CSI in terms of mean power (due to body induced attenuation), fading (due to body movements) and space-frequency selectivity (due to the alteration of the multipath dispersion).
We propose to observe the baseband channel response over the space-frequency-time domains to capture the overall impact and extract all available information about the presence and number of targets in the area. We assume that the MIMO channel can be estimated at the receiver side at the discrete time instants t T = { 1 , 2 , , T } , spanning over T OFDM symbols (or frames). These observations are expected to embed a characteristic footprint of the channel variations induced by people movements in the monitored area.

2.1. MIMO-OFDM Channel Response and CSI Data Sets

Using a conventional channel estimation method based on standard-defined pilot sub-carrier arrangement [12], the MIMO channel response is monitored in the frequency domain over K pilot sub-carriers f F = { f 1 , f 2 , , f K } . OFDM training symbols for channel estimation are usually multiplexed with information symbols in each data frame according to the adopted WiFi standard. The complex baseband channel response observed on frequency f F , over the link = ( a t , a r ) between the transmitting antenna a t = 1 , , M t and receiving antenna a r = 1 , , M r , at time t T , is denoted as H f , , t , where the index L = { 1 , 2 , , L } ranges over the L = M t M r radio links.
Focusing on subject counting, we highlight the channel response in the space and frequency domains as the input for feature processing. In particular, we define the L × 1 space-domain channel response for the sub-carrier f as
H t f = [ H f , , t ] L ,
while the K × 1 frequency-domain response for the link is defined as
H t = [ H f , , t ] f F .
Finally, we define the overall space-frequency CSI as H t = [ H f , , t ] f F , L . These CSI sets are expected to embed information about the number of targets C and are thus used in the next sections to extract the features for target counting.

2.2. Space-Frequency Domain CSI Features

In what follows, features are built according to the CSI measured over the space (1) and the frequency (2) domains. The CSI strength observed at time t T on sub-carrier f F and link L , is defined in dB scale as s f , , t = | H f , , t | dB 2 . Relevant features can be extracted from these CSI data for counting and crowd size classification. In particular, body movements are expected to modify the statistical properties of the CSI in both the space and frequency domains, such as the strength moments evaluated as average over time in a given space-frequency sample (e.g., the mean and variance, related to the body induced attenuation and fading), as well as the correlation over the sub-carrier frequencies (accounting for the crowd impact on the multipath power-delay profile) and over the antennas (accounting for the impact on the MIMO power-space profile). Thereby, in the following, we consider the statistics of the overall CSI power space-frequency profile. The frequency-domain CSI strength vector for link at time t is defined as
s , t = [ s f 1 , , t s f K , , t ] T
which collects the measurements of the channel response over all the sub-carriers for the link L and, similarly, the space-domain CSI vector
s f , t = [ s f , 1 , t s f , L , t ] T
collecting the channel measurements over all active MIMO links, for the pilot sub-carrier f F .
Considering a single channel strength sample s f , , t extracted from the frequency or space domain vectors, we choose four distinctive statistical indicators that provide a representation of the corresponding probability function s f , , t Pr [ s f , , t ] , namely
μ f , = E t s f , , t , σ f , = E t s f , , t - μ f , 2 , ζ f , = E t s f , , t - μ f , σ f , 3 , κ f , = E t s f , , t - μ f , σ f , 4 .
where μ f , and σ f , are the mean and the standard deviation, while ζ f , and κ f , are the skewness (i.e., third moment) and kurtosis (i.e., fourth moment) coefficients herein evaluated over consecutive time samples (i.e., WiFi frames). In particular, the skewness indicator captures the distance between mean and mode values, while kurtosis measures the concentration of the probability mass in the tails of the distribution.
Both skewness and kurtosis features are affected by the number of targets C as illustrated in Figure 2. The figure shows the empirical distribution of the CSI strength measurements p f , ( s ) = Pr [ s f , , t ] built by observing the statistics over a number T of consecutive MIMO WiFi frames t, for the link index 5 ( = 5 ) and the sub-carrier index 5 ( f = f 5 ). Corresponding features include the mean μ , the standard deviation σ , the skewness ζ and the kurtosis κ . As shown in the figure, the values of the CSI indicators change with the number of the subjects inside the area. For example, mean, standard deviation, kurtosis and skewness values are - 52 . 9 dBm, 1 dB, - 0 . 14 dB 3 , and 2 . 8 dB 4 for C = 1 and - 55 . 5 dBm, 4 . 1 dB, - 2 . 24 dB 3 , and 13 . 2 dB 4 for C = 5 , respectively.
A feature embodying the moments of all orders is the CSI distribution. Figure 3 and Figure 4 show the CSI distribution versus links p ( s ) = Pr [ s f , , t = s | ] , for L , and versus frequency p f ( s ) = Pr [ s f , , t = s | f ] , for f F , obtained in the form of histogram from real measurements taken in an indoor environment, with number of targets ranging from C = 0 (i.e., empty environment) up to C = N = 5 . In particular, Figure 3 shows the CSI distribution p ( s ) versus link built by aggregating the samples over the frequency domain, while Figure 4 represents the distribution p f ( s ) versus sub-carrier f by aggregating over the links. As it can be seen from these figures, the CSI distributions in both space and frequency domains are sensitive to the number of targets. Statistics change more significantly for 0 C 2 , whereas, for 2 < C 5 , the dependence due to the number of targets is less evident but still present and can be fully exploited by using the CSI distribution to build these features.
In what follows, real-time subject counting is based on the analysis of time series of T consecutive MIMO WiFi frames. The four statistical indicators in (5) are thus evaluated over both frequency and space domains, resorting to temporal sequences, or features, of T samples or frames. Notice that, as highlighted in Section 3 and Section 4, the number T of frames per feature depends on the learning model. For the frequency domain, we define four time series vectors
μ = [ μ 1 , μ T , ] T , σ = [ σ 1 , σ T , ] T , ζ = [ ζ 1 , ζ T , ] T , κ = [ κ 1 , κ T , ] T .
where μ t , is defined as μ t , = E f [ s f , , t ] while σ t , , ζ t , , and κ t , are defined similarly as in (5) but replacing time index with the frequency one while expectations are made over f F . For space domain, the corresponding features are described by the vectors
μ f = [ μ 1 , f μ T , f ] T , σ f = [ σ 1 , f σ T , f ] T , ζ f = [ ζ 1 , f ζ T , f ] T , κ f = [ κ 1 , f κ T , f ] T .
where μ t , f is defined as μ t , f = E [ s f , , t ] while μ t , f , σ t , f , ζ t , f , and κ t , f are defined as in (5), by replacing the time index with the link one while expectations are made over L . Features (6) and (7) can be also represented as the sets [ μ t , , σ t , , ζ t , , κ t , ] t = 1 T and [ μ t , f , σ t , f , ζ t , f , κ t , f ] t = 1 T , respectively. Both feature sets (6) and (7) are used an input to machine learning tools detailed in Section 3 and Section 4.
As an example, Figure 5 shows the 2D histograms of the CSI mean feature μ t , f and μ t , in the space and frequency domains as well, for varying number C of people in the range 0 C 5 . As shown in the figure, the distribution properties are substantially affected by the number of subjects. Again, the sensitivity is higher for C in the range 0 C 2 . Similar effects (not shown here) are observed considering the deviations, skewness and kurtosis in (6) and (7).
The impact of body movements on higher order CSI moments over multiple antennas (space domain) and sub-carriers (frequency domain) is still unexplored in the literature. In the following sections, we deepen the topic by proving that, compared to the unobstructed environment, the presence of the target(s) affects both the space- and frequency-domain CSI moments and such alterations can be exploited as reliable features for target counting.

3. Computing Architecture and Processing Tools for Subject Counting

In this section, we highlight the computing architecture and the CSI processing tools that are implemented for real-time subject counting. The proposed system relies on computing and caching stages (see Figure 6) that run inside the WiFi Access Point (AP) device, serving as edge node and inside a computing device, deployed remotely that serves as cloud node. The WiFi AP is in charge of the segmentation and extraction of the CSI features (Section 2) obtained from the CSI measurements performed by the deployed WiFi devices. The cloud node processes the features received from the WiFi AP to infer the number of subjects and, possibly, forwards this information via the MQTT (Message Queuing Telemetry Transport) protocol [13] to different subscribers or applications [14]. As depicted in the same figure, the WiFi AP, serving as edge node, exposes a set of REpresentational State Transfer (REST) Application Programming Interfaces (APIs) that are used by the WiFi devices to send the CSI data (via HTTP POSTs) encoded in a Java Script Object Notation (JSON) format [15]. The AP node extracts the CSI features used for subject counting (Section 2) and publishes such features via the MQTT protocol. Finally, the cloud device, acting as MQTT subscriber, processes such features for inference and, in turn, publishes the detected targets results.
In what follows, we discuss the ML tools for CSI feature processing that are implemented in the cloud to perform subject counting.

3.1. Machine Learning Tools for CSI Feature Processing

ML tools [16] are here adopted for real-time processing of the CSI features extracted as in Section 2 from multidimensional channel data observed over time, frequency and links to provide an estimate C ^ of the number of subjects C moving in the monitored area. The learning process benefits from the combination of multiple models that are individually trained and optimized based on the input time series of CSI features defined in (6) and (7) in both frequency and space domain. More specifically, as highlighted in Figure 7, we construct ensembles that independently train J individual Neural Network (NN) models φ j with j = 1 , , J . Ensemble approaches for learning have two major components, that is, a method for creating individual NNs and a method for combining NNs, typically by weighted or unweighted, voting [17]. We consider two NN types, namely the LSTM and the FF-NN models. Each network is optimized, by supervised learning, for a specific feature vector. The input for each model φ j thus consists of space-frequency domain features (6) and (7) obtained in a specific time frame. Ensemble methods combine multiple learning models φ j with the goal of improving the robustness over a single estimator [18]. As highlighted in Section 4, considering the four CSI feature vectors corresponding to the first four statistical moments of the input CSI data shown in Section 2, the proposed ensemble model exploits the output of J classification algorithms. Popular combining architectures for constructing ensemble are the bagging [17] and boosting [19] algorithms. Here, a bagging approach is adopted while unweighted major voting is chosen to combine the outputs C ^ ( φ j ) of the individual NN classifiers to update the new detection estimate. In Section 4, optimal combinations of NNs to maximize the system performances are investigated as well.

3.2. Kullback-Leibler Divergence for Processing of CSI Distribution Features

The Kullback-Leibler (KL) divergence [20] is commonly used to measure the similarity between two distributions. This metric is here employed to design a further tool for subject counting (in addition to ML tools proposed in the previous section), based on the evaluation of the similarity between the CSI distributions built from the training phase and the input CSI observations. We define as q f , ( s | C ) = Pr [ s f , , t = s | C ] the CSI distribution obtained from training on sub-carrier f and link (see Section 4), when C subjects are moving in the space. For C = 1 , , N targets, the subject counting process consists in computing N KL divergence measures (or likelihoods, in the Gaussian case) between the observed input distribution p f , ( s ) = Pr [ s f , , t = s ] and the N distributions q f , ( s | C ) collected at training stage for C = 1 , , N . In particular, the KL divergence for C target(s) is defined as
DKL f , ( C ) = D K L ( p f , ( s ) | | q f , ( s | C ) ) = s p f , ( s ) log p f , ( s ) q f , ( s | C )
Divergence (8) can be used to define feature vectors, similarly to what done for mean, standard deviation, skewness and kurtosis, that are used as input to learning tools (Section 4). In particular, considering processing in the frequency domain, the minimum divergence estimator of the number of targets C ^ can be written as
C ^ = argmin C = 1 , , N k = 1 K DKL f k , ( C )
and, similarly, in the space domain, it is
C ^ f = argmin C = 1 , , N = 1 L DKL f , ( C ) .
For both cases, the final decision is based on majority voting on C , in frequency and C f , in space domain, respectively.

4. Subject Counting: Results and Discussions

The experimental activities have been conducted inside the indoor lab environment shown in Figure 8. We employed a network of 10 MIMO WiFi devices that have been distributed over a monitored area having size 6 × 4 m 2 . A single MIMO WiFi device (i.e., N t = 1 ) is acting as TX while N r = 9 MIMO WiFi devices are configured as RXs; each MIMO WiFi device is equipped with M = 3 antennas. Devices are deployed in the monitored area to capture the target presence at different positions. Usually, the sensing network is not regularly deployed around the perimeter of the monitored area (as also proposed in the experimental setup); however, clustering of nodes might affect the performance, as analyzed in reference [21] for localization purposes. Although not explicitly addressed in this paper, node layout can be optimized during a pre-deployment stage by using, for example, a Cramer-Rao bound based analysis to evaluate the impact of the geometric factor on the inference system performance [21]. This experimental section instead focuses on system validation in a real environment where the deployment of the sensors is assigned and node optimization is not allowed. Deployment optimization to improve the counting accuracy is however an interesting and open problem, also considering that passive detection in large buildings is typically obtained by scaling up the number of MIMO WiFi devices (for dense deployments). In the proposed tests, all MIMO WiFi devices are configured in monitor mode and working in the 5.32 GHz band (i.e., WiFi band 2, channel 64, OFDM symbol sub-carrier spacing 312.5 kHz and nominal bandwidth equal to 20 MHz). The monitor mode allows the receivers to observe the CSI values on the considered channel without explicit IP handshaking procedures. The TX device is programmed to inject (i.e., transmit) custom IEEE 802.11n PHY Protocol Data Units (PPDU) structured as standard High-Throughput (HT) greenfield WiFi format [22] including preamble, MAC addresses, header and payload: injected frames are sent at regular time intervals of 10 ms. In our tests, the TX device acts as an access point while the RX devices are collecting and measuring CSI reports. Modified chipset firmware and kernel [23] have been used to obtain the CSI samples of received IEEE 802.11n data frames. The adopted chipset is the Intel Wireless Link 5300 working as a MIMO-OFDM baseband modem. The modified driver allows to extract the standard CSI reports for uni-cast/broadcast packets. CSI reports are organized into a JSON format as described in Figure 6. Each JSON object tracks the CSI data set, organized by frequency, antenna and time frames, the device identifiers and the timestamps.
As shown in Figure 8, the MIMO WiFi devices are arranged to cover the whole monitored area and to capture signals obstructed by moving objects. The overall number of TX and RX antennas is M t = 3 and M r = 27 , respectively. CSI data are collected over K = 30 sub-carriers. Up to N = 5 people are moving inside the room across the LOS (Line Of Sight) as shown in Figure 8. The LOS path indicates the link path, where TX and RX are facing in straight line.

4.1. Classifier tools for Subject Counting

In our experimental tests, the crowd size increases from C = 1 up to C = 5 people while body motion patterns are performed inside the monitored area as shown in Figure 8. In particular, in the proposed set-up, 1 C 5 people enter the area one by one, staying/wandering for 10 s across the LOS path and then leaving sequentially one by one. Thus, up to N = 5 people may occupy simultaneously the monitored area. To detect the presence and estimate the number of people, we exploit state-of-the-art classifiers that employ the first two statistical channel strength moments as CSI features, namely the mean and standard deviations. In particular, we compare the performance of FF-NN, LSTM and maximum likelihood approaches [24] implemented separately, as routinely done in device-free localization (DFL) applications. We evaluate the performance in terms of accuracy, defined as the probability of correct classification. In addition, we assess also the performance of the KL divergence minimization approach (Section 3.2) that can be interpreted as a generalization of the maximum likelihood techniques, under non-Gaussian assumptions.
Table 1 shows the accuracy results for the maximum likelihood approach. The CSI distribution p f , ( s ) is approximated here as normal with mean and standard deviation terms altered by the presence of multiple targets. Numerical results show that despite the system can accurately discriminate between C = 1 and C = 2 targets, when the crowd size increases the counting accuracy performance drops to 0 . 55 and 0 . 30 , respectively.
In Table 2, we assess the performance of the subject counting system using the KL divergence minimization. In particular, during the training phase, we collect CSI measurements from L = 27 links and K = 30 sub-carriers and extract the corresponding distributions q f , ( s | C ) for varying number of targets C = 1 , , 5 . Real-time processing is then implemented separately in the space (10) and frequency domain (9) by choosing the minimum divergence with respect to the observed (input) distribution p f , ( s ) . Table 2 summarizes the accuracy performance obtained in the space and frequency domain processing, respectively. As seen previously according to the maximum likelihood approach, we observe a significant performance degradation when the number of targets increases.
Figure 9 and Figure 10 represent the confusion matrix results using the mean and standard deviation CSI features, in the space-frequency domain, when FF-NN (Figure 9) and LSTM (Figure 10) networks are chosen as classifiers, respectively. Confusion matrices highlight the average accuracy, as well as the precision and recall figures for varying number of targets 0 C 5 . Compared with the previous techniques, the average accuracy improves. Moreover, space domain CSI features (7) provide some advantages compared with frequency domain terms. In the following section, an ensemble classifier is adopted to combine multiple CSI features. In addition, we assess also the inclusion of skewness and kurtosis metrics (5) to further improve the system performances.

4.2. Ensemble Learning and Bagging: Channel Features and Comparative Analysis

In this section, we assess the performances of an ensemble method (Figure 7) to combine different models with the goal of improving the robustness over the single estimators highlighted in the previous sections. In particular, the proposed ensemble model exploits the output of different classification algorithms, separately trained, that use four different CSI features as inputs, namely the mean, standard deviation, skewness and kurtosis (5). Table 3 shows the accuracy results for the same layout of Figure 8: it compares four CSI features: mean, standard deviation, skewness and kurtosis, using an ensemble model that implements either FF-NN or LSTM classifiers in each branch φ j . In particular, the FF-NN consists of one hidden layer of 10 nodes and takes as input either the space or the frequency domain CSI features obtained from T = 50 WiFi frames. Likewise, the LSTM network consists of 100 hidden units and extracts sequences of CSI features using a moving window of T = 8 consecutive WiFi frames.
As highlighted in the previous sections, the ensemble model learns a classifier (FF-NN or LSTM) on different training sets, namely different CSI features and finally combines the results by voting. Multiple FF-NN and LSTM networks are thus trained using different CSI features and a supervised approach. For both FF-NN and LSTM, 80% of data is used for training, 10% for validation and 10% for testing. The results summarized in Table 3 show that an ensemble model, based on J = 11 parallel LSTM classifier networks, provides the best performance with 99% accuracy. Regarding the space-frequency diversity for target counting, the use of space-domain features provides more accurate results (regardless of the deployed classifier, that is, FF-NN or LSTM). For example, an ensemble model that combines J = 11 LSTM classifiers and uses statistical average features provides an accuracy of 99% in the space domain. The accuracy drops to 83% when using frequency domain features (and the same ML classifiers). FF-NN classifier networks obtain an accuracy of 95% using space domain features and drops to 81% with frequency domain ones. Therefore, these results confirm that space domain CSI features are more effective in capturing the multipath effects induced by multiple targets. Also, comparing various features used for target counting, the experiments confirm that combining features in both space and frequency domain provides more accurate results.

5. Conclusions

In this paper, we proposed device-free target counting using physical layer channel information from a WiFI-MIMO network in the space-frequency domain. Multidimensional statistical channel features are introduced and an ensemble classifier based on machine learning tools is detailed to efficiently use the multidimensional CSI data for target counting. Cloud computing is proposed to collect and classify the channel features in real-time for target counting. Experimental tests conducted to validate the proposed approach performance in indoor environments with crowd size up to N = 5 people, give a classification accuracy of about 99% using LSTM and statistical mean feature in the space domain. Future work will focus on two major activities: the exploitation of a physical model to simplify feature extraction during the calibration phase and the introduction of heterogeneous models for ensemble classification.

Author Contributions

Conceptualization, S.K., S.S., V.R. and M.N.; Methodology, S.K., S.S., V.R. and M.N.; Software, S.K., S.S.; Validation, S.K., S.S., V.R. and M.N.; Formal analysis, S.K., S.S., V.R. and M.N.; Investigation, S.K., S.S., V.R. and M.N.; Writing–original draft preparation, S.K., S.S., V.R. and M.N.; Writing–review and editing, S.K., S.S., V.R. and M.N.; Supervision, S.S. and V.R.; Funding acquisition, S.S. and V.R.

Funding

This research was partially funded by ERANET (H2020) CHIST-ERA III, RadioSense project.

Conflicts of Interest

The authors declare no conflict of interest.

References

  1. Sigg, S.; Scholz, M.; Shi, S.; Ji, Y.; Beigl, M. RF-Sensing of Activities from Non-Cooperative Subjects in Device-Free Recognition Systems Using Ambient and Local Signals. IEEE Trans. Mob. Comput. 2014, 13, 907–920. [Google Scholar] [CrossRef]
  2. Witrisal, K.; Meissner, P.; Leitinger, E.; Shen, Y.; Gustafson, C.; Tufvesson, F.; Haneda, K.; Dardari, D.; Molisch, A.F.; Conti, A.; et al. High-Accuracy Localization for Assisted Living: 5G systems will turn multipath channels from foe to friend. IEEE Signal Process. Mag. 2016, 33, 59–70. [Google Scholar] [CrossRef]
  3. Kianoush, S.; Savazzi, S.; Vicentini, F.; Rampa, V.; Giussani, M. Device-free RF human body fall detection and localization in industrial workplaces. IEEE Internet Things J. 2016, 4, 351–362. [Google Scholar] [CrossRef]
  4. Bartoletti, S.; Conti, A.; Win, M.Z. Device-Free Counting via Wideband Signals. IEEE J. Sel. Areas Commun. 2017, 35, 1163–1174. [Google Scholar] [CrossRef]
  5. Wang, Y.; Wu, K.; Ni, L.M. Wifall: Device-free fall detection by wireless networks. IEEE Trans. Mob. Comput. 2016, 16, 581–594. [Google Scholar] [CrossRef]
  6. Fadhlullah, S.Y.; Ismail, W. A Statistical Approach in Designing an RF-Based Human Crowd Density Estimation System. Int. J. Distrib. Sens. Netw. 2016, 12, 8351017. [Google Scholar] [CrossRef]
  7. Xu, C.; Firner, B.; Moore, R.S.; Zhang, Y.; Trappe, W.; Howard, R.; Zhang, F.; An, N. SCPL: Indoor device-free multi-subject counting and localization using radio signal strength. In Proceedings of the IEEE International Conference on Information Processing in Sensor Networks (IPSN), Philadelphia, PA, USA, 8–11 April 2013. [Google Scholar]
  8. Cheng, Y.K.; Chang, R.Y. Device-Free Indoor People Counting Using Wi-Fi Channel State Information for Internet of Things. In Proceedings of the IEEE Global Communications Conference (GLOBECOM), Singapore, 4–8 December 2017. [Google Scholar]
  9. Choi, J.W.; Quan, X.; Cho, S.H. Bi-Directional Passing People Counting System Based on IR-UWB Radar Sensors. IEEE Internet Things J. 2017, 5, 512–522. [Google Scholar] [CrossRef]
  10. Zazo, R.; Nidadavolu, P.S.; Chen, N.; Gonzalez-Rodriguez, J.; Dehak, N. Age Estimation in Short Speech Utterances Based on LSTM Recurrent Neural Networks. IEEE Access 2018, 6, 22524–22530. [Google Scholar] [CrossRef]
  11. Maeda, K. Performance evaluation of object serialization libraries in XML, JSON and binary formats. In Proceedings of the Digital Information and Communication Technology and It’s Applications (DICTAP), Bangkok, Thailand, 16–18 May 2012; pp. 177–182. [Google Scholar]
  12. Cicerone, M.; Simeone, O.; Spagnolini, U. Channel Estimation for MIMO-OFDM Systems by Modal Analysis/Filtering. IEEE Trans. Commun. 2006, 54, 2062–2074. [Google Scholar] [CrossRef]
  13. ISO/IEC 20922:2016 Information Technology—Message Queuing Telemetry Transport (MQTT) v3.1.1; International Organization for Standardization, 2016; Available online: https://www.iso.org/standard/69466.html (accessed on 5 August 2019).
  14. Kianoush, S.; Raja, M.; Savazzi, S.; Sigg, S. A cloud-IoT platform for passive radio sensing: Challenges and application case studies. IEEE Internet Things J. 2018, 5, 3624–3636. [Google Scholar] [CrossRef]
  15. Barbarossa, S.; Sardellitti, S.; Di Lorenzo, P. Communicating while computing: Distributed mobile cloud computing over 5G heterogeneous networks. IEEE Signal Process. Mag. 2014, 31, 45–55. [Google Scholar] [CrossRef]
  16. Hong, M.; Razaviyayn, M.; Luo, Z.Q.; Pang, J.S. A Unified Algorithmic Framework for Block-Structured Optimization Involving Big Data: With applications in machine learning and signal processing. IEEE Signal Process. Mag. 2016, 33, 57–77. [Google Scholar] [CrossRef]
  17. Breiman, L. Bagging predictors. Mach. Learn. 1996, 24, 123–140. [Google Scholar] [CrossRef] [Green Version]
  18. Islam, M.M.; Yao, X.; Nirjon, S.S.; Islam, M.A.; Murase, K. Bagging and Boosting Negatively Correlated Neural Networks. IEEE Trans. Syst. Man Cybern. 2008, 38, 771–874. [Google Scholar] [CrossRef] [PubMed]
  19. Schapire, R.E. The strength of weak learnability. Mach. Learn. 1990, 5, 197–227. [Google Scholar] [CrossRef] [Green Version]
  20. Depatla, S.; Muralidharan, A.; Mostofi, Y. Occupancy estimation using only wifi power measurements. IEEE J. Sel. Areas Commun. 2015, 33, 1381–1393. [Google Scholar] [CrossRef]
  21. Kianoush, S.; Rampa, V.; Savazzi, S.; Nicoli, M. Pre-deployment performance assessment of device-free radio localization systems. In Proceedings of the 2016 IEEE International Conference on Communications Workshops (ICC’16), Kuala Lumpur, Malaysia, 23–27 May 2016; pp. 1–6. [Google Scholar]
  22. Omar, H.A.; Abboud, K.; Cheng, N.; Malekshan, K.R.; Gamage, A.T.; Zhuang, W. A Survey on High Efficiency Wireless Local Area Networks: Next Generation WiFi. IEEE Commun. Surv. Tutor. 2016, 18, 2315–2344. [Google Scholar] [CrossRef]
  23. Bahai, A.R.S.; Saltzberg, B.R.; Ergen, M. Multi-Carrier Digital Communications: Theory and Applications of OFDM, 2nd ed.; Springer: New York, NY, USA, 2004. [Google Scholar] [CrossRef]
  24. Kianoush, S.; Savazzi, S.; Rampa, V. Tracking of frequency selectivity for device-free detection of multiple targets. In Proceedings of the 2017 IEEE International Conference on Communications (ICC’2017), Paris, France, 21–25 May 2017; pp. 1–6. [Google Scholar]
Figure 1. Crowd sensing using a WiFi network composed by N t transmitting and N r receiving MIMO WiFi devices. Each MIMO WiFi device is equipped with M antennas while M t = N t M and M r = N r M are the total number of transmitting and receiving antennas, respectively.
Figure 1. Crowd sensing using a WiFi network composed by N t transmitting and N r receiving MIMO WiFi devices. Each MIMO WiFi device is equipped with M antennas while M t = N t M and M r = N r M are the total number of transmitting and receiving antennas, respectively.
Sensors 19 03450 g001
Figure 2. Histogram p f , ( s ) of the CSI strength sample s 5 , 5 in dBm, for f = f 5 and = 5 , and the corresponding extracted features including mean [dBm], standard deviation [dB], skewness [dB 3 ], and kurtosis [dB 4 ] over time series. Measurements are obtained from the proposed system deployed in the layout shown in Figure 3 in the empty space case ( C = 0 ) and for the subject number C ranging from 1 up to 5.
Figure 2. Histogram p f , ( s ) of the CSI strength sample s 5 , 5 in dBm, for f = f 5 and = 5 , and the corresponding extracted features including mean [dBm], standard deviation [dB], skewness [dB 3 ], and kurtosis [dB 4 ] over time series. Measurements are obtained from the proposed system deployed in the layout shown in Figure 3 in the empty space case ( C = 0 ) and for the subject number C ranging from 1 up to 5.
Sensors 19 03450 g002
Figure 3. CSI histograms versus links p ( s ) with = 1 , , 27 , for the number of people ranging from C = 0 (empty environment) up to C = 5 evaluated by aggregating data over K = 30 sub-carriers.
Figure 3. CSI histograms versus links p ( s ) with = 1 , , 27 , for the number of people ranging from C = 0 (empty environment) up to C = 5 evaluated by aggregating data over K = 30 sub-carriers.
Sensors 19 03450 g003
Figure 4. CSI histograms versus frequency p f ( s ) with f = f 1 , , f 30 , for the number of people ranging from C = 0 (empty environment) to C = 5 evaluated by aggregating data over L = 27 links.
Figure 4. CSI histograms versus frequency p f ( s ) with f = f 1 , , f 30 , for the number of people ranging from C = 0 (empty environment) to C = 5 evaluated by aggregating data over L = 27 links.
Sensors 19 03450 g004
Figure 5. 2D histogram of CSI mean in space (i.e., μ t , f ) and frequency (i.e., μ t , ) domain for the empty space case and the subject number ranging from 1 up to 5.
Figure 5. 2D histogram of CSI mean in space (i.e., μ t , f ) and frequency (i.e., μ t , ) domain for the empty space case and the subject number ranging from 1 up to 5.
Sensors 19 03450 g005
Figure 6. Edge-cloud computing architecture for CSI data manipulation, feature processing and subject counting.
Figure 6. Edge-cloud computing architecture for CSI data manipulation, feature processing and subject counting.
Sensors 19 03450 g006
Figure 7. Block diagram of the ensemble classifier where the input data are the CSI samples. Multiple features, that are used as input to the model blocks, include mean μ , standard deviation σ , skewness ζ , and kurtosis κ computed in the frequency { μ , σ , ζ , κ } (6) and space { μ f , σ f , ζ f , κ f } (7) domains. LSTM and FF-NN models are used in model blocks φ j and an individual decision is made by each block. Ensemble classification C ^ is based on majority vote of individual decisions C ^ ( φ j ) .
Figure 7. Block diagram of the ensemble classifier where the input data are the CSI samples. Multiple features, that are used as input to the model blocks, include mean μ , standard deviation σ , skewness ζ , and kurtosis κ computed in the frequency { μ , σ , ζ , κ } (6) and space { μ f , σ f , ζ f , κ f } (7) domains. LSTM and FF-NN models are used in model blocks φ j and an individual decision is made by each block. Ensemble classification C ^ is based on majority vote of individual decisions C ^ ( φ j ) .
Sensors 19 03450 g007
Figure 8. Deployment layout for the experimental subject counting system including N t = 1 TX and N r = 9 RX devices with M = 3 antennas for each device. Distance between antennas at receiver is 6 cm, while transmitter antennas are spaced apart by 21 cm. The monitored area size is 6 × 4 m 2 . Up to N = 5 people enter this area and walk along the middle line thus blocking the LOS path between TX and RXs.
Figure 8. Deployment layout for the experimental subject counting system including N t = 1 TX and N r = 9 RX devices with M = 3 antennas for each device. Distance between antennas at receiver is 6 cm, while transmitter antennas are spaced apart by 21 cm. The monitored area size is 6 × 4 m 2 . Up to N = 5 people enter this area and walk along the middle line thus blocking the LOS path between TX and RXs.
Sensors 19 03450 g008
Figure 9. Confusion matrices for the FF-NN classifier for subject counting, C = 0 , 1 , 2 , . . . , 5 where C = 0 corresponds to the empty environment. Top figure: frequency domain CSI features: mean (left) and standard deviation (right). Bottom figure: space domain CSI features: mean (left) and standard deviation (right).
Figure 9. Confusion matrices for the FF-NN classifier for subject counting, C = 0 , 1 , 2 , . . . , 5 where C = 0 corresponds to the empty environment. Top figure: frequency domain CSI features: mean (left) and standard deviation (right). Bottom figure: space domain CSI features: mean (left) and standard deviation (right).
Sensors 19 03450 g009
Figure 10. Confusion matrices for the LSTM classifier for subject counting, C = 0 , 1 , 2 , . . . , 5 . Top figures: frequency domain CSI features: mean (left) and standard deviation (right). Bottom figures: space domain CSI features: mean (left) and standard deviation (right).
Figure 10. Confusion matrices for the LSTM classifier for subject counting, C = 0 , 1 , 2 , . . . , 5 . Top figures: frequency domain CSI features: mean (left) and standard deviation (right). Bottom figures: space domain CSI features: mean (left) and standard deviation (right).
Sensors 19 03450 g010
Table 1. Accuracy of subjects counting using maximum likelihood estimation based on normal modelling for the CSI distribution Pr [ s f , , t ] in space and frequency domains.
Table 1. Accuracy of subjects counting using maximum likelihood estimation based on normal modelling for the CSI distribution Pr [ s f , , t ] in space and frequency domains.
CEmpty12345
space domain0.4410.840.90.430.55
frequency domain10.90.920.310.540.3
Table 2. Accuracy of subjects counting using the CSI distribution features and the KL divergence approach in space (10) and frequency domains (9).
Table 2. Accuracy of subjects counting using the CSI distribution features and the KL divergence approach in space (10) and frequency domains (9).
C12345
space domain10.510.430.290.22
frequency domain0.930.50.40.390.28
Table 3. Accuracy of counting subjects using machine learning tools in space and frequency domains.
Table 3. Accuracy of counting subjects using machine learning tools in space and frequency domains.
FF-NN, Frequency Domain
CEmpty12345
kurtosis0.240.210.680.360.420.557
skewness0.450.9560.630.70.590.52
standard deviation0.680.780.840.670.580.76
mean0.730.90.940.790.670.84
FF-NN, Space Domain
CEmpty12345
kurtosis0.310.430.610.460.430.47
skewness0.290.460.540.360.540.62
standard deviation0.430.650.890.630.710.78
mean0.920.960.980.940.960.92
LSTM, Frequency Domain
CEmpty12345
kurtosis110.90.360.630.45
skewness10.910.90.270.45
standard deviation0.18110.810.720.9
mean0.720.4510.811
LSTM, Space Domain
CEmpty12345
kurtosis10.910.630.720.36
skewness1110.720.630.54
standard deviation0.811111
mean1110.911

Share and Cite

MDPI and ACS Style

Kianoush, S.; Savazzi, S.; Rampa, V.; Nicoli, M. People Counting by Dense WiFi MIMO Networks: Channel Features and Machine Learning Algorithms. Sensors 2019, 19, 3450. https://doi.org/10.3390/s19163450

AMA Style

Kianoush S, Savazzi S, Rampa V, Nicoli M. People Counting by Dense WiFi MIMO Networks: Channel Features and Machine Learning Algorithms. Sensors. 2019; 19(16):3450. https://doi.org/10.3390/s19163450

Chicago/Turabian Style

Kianoush, Sanaz, Stefano Savazzi, Vittorio Rampa, and Monica Nicoli. 2019. "People Counting by Dense WiFi MIMO Networks: Channel Features and Machine Learning Algorithms" Sensors 19, no. 16: 3450. https://doi.org/10.3390/s19163450

Note that from the first issue of 2016, this journal uses article numbers instead of page numbers. See further details here.

Article Metrics

Back to TopTop