Device-Free Indoor Activity Recognition System

Abstract: In this paper, we explore the properties of the channel state information (CSI) of WiFi signals and present a device-free indoor activity recognition system. The proposed system uses only one ubiquitous router access point and a laptop as a detection point, while the user is free and needs neither to wear sensors nor to carry devices. The proposed system recognizes six daily activities, such as walk, crawl, fall, stand, sit


Introduction
Human activity recognition (HAR) has attracted intense academic and industrial interest owing to its real-life applications, such as elderly monitoring. Traditional HAR systems are device-based and rely on vision [1], body-worn sensors [2], or built-in smartphone sensors [3]. However, both vision-based and sensor-based systems have drawbacks. Vision-based systems cannot work through walls, require good lighting conditions, and invade privacy. Sensor-based approaches require the user to carry a device, which is easy to forget and inconvenient in some conditions.
With the rapid development of wireless communication, mobile computing, and artificial intelligence, common broadband wireless signals have gradually been extended from pure communication and networking to environment perception and localization services, forming a new research direction: device-free wireless sensing [4]. Unlike specialized ultra-wideband (UWB) wireless sensing systems or radar technologies [4,5], wireless sensing systems principally use ordinary broadband wireless equipment (such as WLAN) to achieve general context awareness. The technology is called device-free sensing because the monitored environment does not need to be equipped with sensors or cameras, and the monitored subjects do not need to carry wireless devices or wear sensors.
Earlier, Youssef et al. [6] introduced the concept of device-free passive (DfP) localization, which locates and tracks entities that neither carry a device nor participate in the localization process. They used fluctuations of the received signal strength indicator (RSSI) as a metric for human presence in the monitored area. RSSI has since been adopted in many approaches, such as localization [7,8], activity recognition [9,10], and gesture recognition [11]. However, RSSI is an unstable metric: its performance degrades severely in complex environments because of multipath fading and temporal dynamics [4].
Recently, researchers have leveraged physical-layer channel state information (CSI) instead of the medium access control (MAC) layer RSSI, since CSI achieves finer sensing granularity and offers more information dimensions. Unlike RSSI, CSI is measured per orthogonal frequency-division multiplexing (OFDM) subcarrier for each received packet, whereas RSSI provides only a single value per packet. CSI therefore carries more information and is more robust in complex environments, which helps in building more sophisticated, sensitive, and robust sensing models and makes more sophisticated sensing applications possible. Accordingly, in recent years, CSI has been leveraged in device-free sensing approaches such as human localization [12,13], motion detection [14,15], people counting [16], activity recognition [17,18], and gesture recognition [19,20].
In this paper, we present a device-free human activity recognition system that leverages the fluctuation of CSI at the receiver end of the wireless link, so that activities are detected without dedicated devices. The work is motivated by the need to reduce the installation complexity and development costs of traditional device-based activity recognition systems. We exploit the fact that human motion in an indoor environment interferes with wireless signals and causes fading and shadowing effects. Human motion thus perturbs the CSI, and each activity creates distinct perturbations in the raw CSI. Our system recognizes several human states, namely walking, crawling, standing, sitting, lying, falling, and empty (i.e., no human), in line-of-sight (LOS) and non-line-of-sight (NLOS) scenarios. Experimentally, we find that each activity has a distinct effect on the raw CSI; thus, the proposed system can be used to monitor people in indoor environments and to detect their activities. Unlike previous CSI-based human activity recognition approaches, our system detects human activity in both LOS and NLOS scenarios with high and stable accuracy.
Our main contributions are as follows:

• We present a device-free activity recognition system which enables us to classify several human activities in both LOS and NLOS scenarios.
• We present a pattern segmentation algorithm based on local outlier factor (LOF) to detect the abnormal CSI segments that are caused by human activities; then, we extract useful features from both time domain and frequency domain of raw CSI.
• We adopt sparse representation classification (SRC) algorithm to recognize the proposed activities, and our classification algorithm gains high-accuracy rates in LOS and NLOS.
• We conduct exhaustive experiments with different users that provide us with important insights. Therefore, the system classification ability can be improved by choosing the best parameters gained by experiments.

The remainder of the paper is organized as follows. Section 2 presents the methodology. Section 3 describes the experiment setup and evaluation. Section 4 presents the discussion. Section 5 concludes the paper and presents some suggestions for future work.

Methodology
In this section, we describe the architecture and the main components of our system. Figure 1 shows the workflow of the proposed system. The system extracts the amplitude information from WiFi signals and then filters out the noise. Thereafter, it selects the CSI streams that best reflect the fluctuation of CSI caused by human activities. From the selected streams, the system extracts the features of each activity in order to simplify the classification process. Finally, the system uses SRC to determine the type of the activity. In what follows, we describe the whole procedure in detail.

Preprocessing
We collect CSI from WiFi signals transmitted by a TP-LINK WiFi router with two transmitter antennas. The receiver is a Lenovo ThinkPad X201 equipped with an Intel WiFi Link (IWL) 5300 network interface controller (NIC) (Intel Corporation, Santa Clara, CA, USA) with three receiver antennas. The laptop runs Ubuntu 14.04 and the open-source CSI-Tools [21]. The two transmitter antennas and three receiver antennas form a 2 × 3 multiple-input multiple-output (MIMO) system, which yields six streams. Each stream contains 30 subcarriers because the 5300 NIC reports CSI for 30 groups of subcarriers per stream [21]. Thus, the CSI can be represented as Equation (1):

$$H = \begin{bmatrix} H_{1,1} & \cdots & H_{1,30} \\ \vdots & \ddots & \vdots \\ H_{6,1} & \cdots & H_{6,30} \end{bmatrix}, \quad (1)$$

where $H$ is the CSI, and $i$ and $j$ index the streams and subcarriers, respectively. Each element of the CSI matrix in Equation (1) is expressed as:

$$H_{i,j}(f_k) = |H_{i,j}(f_k)| \, e^{\mathrm{j} \angle H_{i,j}(f_k)} + n, \quad (2)$$

where $|H_{i,j}(f_k)|$ is the amplitude, $\angle H_{i,j}(f_k)$ is the phase, $f_k$ is the central frequency, $\mathrm{j}$ is the imaginary unit, and $n$ is the electromagnetic noise from adjoining devices. Here, we use only the amplitude information of CSI.
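The amplitude extraction step can be illustrated with a minimal sketch. It assumes each received packet is already available as six streams of 30 complex CSI values; the packet below is synthetic, and the helper names `synthetic_packet` and `csi_amplitude` are hypothetical and are not part of CSI-Tools.

```python
import cmath

N_TX, N_RX, N_SUBCARRIERS = 2, 3, 30      # antenna and subcarrier counts from the text
N_STREAMS = N_TX * N_RX                   # 2 x 3 MIMO gives six streams

def synthetic_packet(phase_step=0.1):
    """Build a fake packet: stream i has amplitude i + 1 on every subcarrier."""
    return [[cmath.rect(float(i + 1), phase_step * j) for j in range(N_SUBCARRIERS)]
            for i in range(N_STREAMS)]

def csi_amplitude(packet):
    """Keep only the amplitude |H_ij| of every complex CSI entry."""
    return [[abs(h) for h in stream] for stream in packet]

packet = synthetic_packet()
amplitude = csi_amplitude(packet)         # 6 x 30 list of non-negative floats
```

The phase is simply discarded here, which mirrors the statement that only the amplitude information is used.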
Figure 2 shows the 30 subcarriers of one CSI stream during the walking activity. The real trend of the CSI is drowned in noise from the surrounding environment. To remove the noise and recover the actual trend of the CSI, we employ a weighted moving average (WMA). Figure 2 (bottom) shows that the WMA removes noise effectively while keeping the real trend of the CSI. Figure 3 shows all the streams obtained from a walking activity experiment. The subcarrier values differ within each stream. To reduce the computational complexity, we aggregate the 30 subcarriers of each stream into a single value (stream vector) by averaging them, as portrayed in Figure 4. As Figure 4 shows, the streams differ in their sensitivity to the CSI fluctuation caused by human motion, so the values of the six streams are different. Streams 1 and 4 are not sensitive enough to the human activity and are almost flat; such streams must be eliminated. To this end, we use the bad stream elimination algorithm described in previous work [22]. For every spatial diversity stream, the algorithm calculates three features: the difference between the maximum peak and the minimum valley (Max-Min), the mean value (Mean), and the standard deviation (STD). These features are fed as vectors into a Naive Bayes classifier, which identifies and eliminates the bad streams, because using bad streams as source data causes significant errors. Thus, our system keeps only the streams that provide useful information, which also avoids false detections.
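The preprocessing steps above can be sketched as follows. This is an illustration under stated assumptions, not the authors' implementation: the linear weights 1..m and the window length `m` are assumed (the text does not give the WMA weights), the standard deviation is the population form, and the Naive Bayes stage of [22] is not shown.

```python
from statistics import mean, pstdev

def weighted_moving_average(series, m=5):
    """Smooth a series with linear weights 1..m (newest sample weighted most)."""
    smoothed = []
    for t in range(len(series)):
        window = series[max(0, t - m + 1): t + 1]
        # Early samples have a shorter window; use the last len(window) weights.
        weights = list(range(m - len(window) + 1, m + 1))
        smoothed.append(sum(w * v for w, v in zip(weights, window)) / sum(weights))
    return smoothed

def stream_vector(packets):
    """Average the subcarrier amplitudes of one stream, packet by packet."""
    return [mean(subcarriers) for subcarriers in packets]

def stream_features(series):
    """Max-Min, mean and standard deviation of one aggregated stream."""
    return (max(series) - min(series), mean(series), pstdev(series))

# Hypothetical stream: 4 packets x 3 subcarriers (a real stream has 30 subcarriers).
packets = [[1.0, 2.0, 3.0], [2.0, 2.0, 2.0], [4.0, 5.0, 6.0], [2.0, 3.0, 4.0]]
vector = stream_vector(packets)
smooth = weighted_moving_average(vector, m=3)
features = stream_features(smooth)
```

A nearly flat stream gives a small Max-Min and a small standard deviation, which is the kind of evidence the bad stream classifier works from.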

Pattern Segmentation
Anomaly detection is essential for obtaining the correct segment of the performed activity. To this end, we present an efficient pattern segmentation method that exposes the CSI segment made anomalous by a human activity. Breunig et al. [23] presented the local outlier factor (LOF) algorithm, which detects anomalies by comparing the local density of a point with the local densities of its k-nearest neighbors. Here, we use LOF to detect the anomalous changes in WiFi signals caused by human activities such as walk, sit, stand, crawl, fall, and lie. These activities have a significant impact on the raw CSI, and the resulting fluctuation is clear and differs from one activity to another. The local density is estimated from the reachability distance $\text{reach-d}_k(p, n)$, where $N_k(p)$ is the set of k-nearest neighbors of $p$, and $n$ is a neighbor point. The reachability distance is computed as:

$$\text{reach-d}_k(p, n) = \max\{k\text{-distance}(n),\, d(p, n)\}, \quad (3)$$

where $d(p, n)$ is the distance between $p$ and $n$, and $k\text{-distance}(n)$ is the distance from $n$ to its k-th nearest neighbor. The local density Lrd of $p$ is calculated as Equation (4):

$$\mathrm{Lrd}(p) = 1 \Big/ \left( \frac{\sum_{n \in N_k(p)} \text{reach-d}_k(p, n)}{|N_k(p)|} \right), \quad (4)$$

and the LOF of data point $p$ is computed as Equation (5):

$$\mathrm{LOF}_k(p) = \frac{1}{|N_k(p)|} \sum_{n \in N_k(p)} \frac{\mathrm{Lrd}(n)}{\mathrm{Lrd}(p)}. \quad (5)$$

Therefore, LOF is the ratio of the average local density of the neighbors of $p$ to the local density of $p$.
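To make the LOF computation concrete, the sketch below implements the standard definitions (reachability distance, local reachability density, LOF) in plain Python. The sample values and the choice `k=3` are hypothetical, and the sketch assumes no duplicate points, since identical points would give a zero reachability sum.

```python
import math

def lof_scores(points, k=3):
    """Return the local outlier factor of every point (points: list of tuples)."""
    n = len(points)
    dist = [[math.dist(p, q) for q in points] for p in points]
    # Indices of the k nearest neighbours of each point, excluding the point itself.
    neigh = [sorted((j for j in range(n) if j != i), key=lambda j: dist[i][j])[:k]
             for i in range(n)]
    k_dist = [dist[i][neigh[i][-1]] for i in range(n)]   # distance to the k-th neighbour

    def reach(i, j):
        """Reachability distance of point i with respect to neighbour j."""
        return max(k_dist[j], dist[i][j])

    # Local reachability density: inverse of the mean reachability distance.
    lrd = [len(neigh[i]) / sum(reach(i, j) for j in neigh[i]) for i in range(n)]
    # LOF: mean ratio of the neighbours' density to the point's own density.
    return [sum(lrd[j] for j in neigh[i]) / (len(neigh[i]) * lrd[i]) for i in range(n)]

# Hypothetical CSI amplitudes: five quiet samples and one large deviation.
samples = [(1.0,), (1.1,), (0.9,), (1.05,), (0.95,), (5.0,)]
scores = lof_scores(samples, k=3)
anomalous = [i for i, s in enumerate(scores) if s > 2.0]   # threshold is an assumption
```

Scores close to 1 indicate samples whose density matches their neighbors, while a score well above 1 marks a sample in a much sparser region, which is how an activity segment stands out from the quiet background.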

Feature Extraction
Human motion in the test area affects both the time domain and the frequency domain of the raw CSI. In the time domain, we choose three features. First, we calculate the maximum peak:

$$\mathrm{Max} = \max_{n \in W} \mathrm{CSI}(n), \quad (6)$$

where $\mathrm{CSI}(n)$ is the value of the channel state information at sample $n$ of a window $W$ with $|W|$ samples. Then, we calculate the third central moment $\gamma$ as delineated in Equation (7):

$$\gamma = \mathrm{E}\left[(\mathrm{CSI} - \mu)^3\right], \quad (7)$$

where $\mathrm{E}$ denotes the expectation, and $\mu$ is the mean value of the CSI amplitude, which is calculated as:

$$\mu = \frac{1}{|W|} \sum_{n=1}^{|W|} \mathrm{CSI}(n), \quad (8)$$

where $n = 1$ is the start point of window $W$. Moreover, we calculate the root mean square (RMS) deviation of the CSI from the mean value $\mu$:

$$\mathrm{RMS} = \sqrt{\frac{1}{|W|} \sum_{n=1}^{|W|} \left(\mathrm{CSI}(n) - \mu\right)^2}. \quad (9)$$

The time domain features alone may not be enough to discriminate two different actions, so it is important to analyze the frequency components as well. Hence, we choose two frequency features. First, we calculate the DC component $a_0$. To do so, we apply the fast Fourier transform (FFT) to all $n$ samples in the dynamic window $W$ and calculate the $i$-th frequency component as:

$$\mathrm{FFT}(i) = \sum_{n=1}^{|W|} \mathrm{CSI}(n)\, e^{-\mathrm{j} 2 \pi i n / |W|}. \quad (10)$$

The DC component is the average of $\mathrm{FFT}(i)$:

$$a_0 = \frac{1}{|W|} \sum_{i=1}^{|W|} \mathrm{FFT}(i). \quad (11)$$

The last feature is the entropy, which is calculated over the chosen points in the $n$-sample set, as explained in Equation (12):

$$\mathrm{Entropy} = -\sum_{i} P_i \log P_i, \quad (12)$$

where $P_i$ is the probability of each spectral band $\mathrm{FFT}(i)$.
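The five features described above can be computed for one window as in the following sketch. Several choices are assumptions made for illustration: a naive discrete Fourier transform stands in for the FFT, the DC component is taken as the average magnitude of the frequency components, and the band probabilities are the normalized spectral power with a base-2 logarithm.

```python
import cmath
import math

def dft(x):
    """Naive discrete Fourier transform (same result as an FFT, only slower)."""
    n = len(x)
    return [sum(x[t] * cmath.exp(-2j * math.pi * i * t / n) for t in range(n))
            for i in range(n)]

def window_features(csi):
    """Return [max peak, third central moment, RMS deviation, DC, entropy]."""
    n = len(csi)
    mu = sum(csi) / n
    peak = max(csi)
    third_moment = sum((v - mu) ** 3 for v in csi) / n
    rms_dev = math.sqrt(sum((v - mu) ** 2 for v in csi) / n)
    mags = [abs(c) for c in dft(csi)]
    dc = sum(mags) / n                     # assumption: average FFT magnitude
    power = [m * m for m in mags]
    total = sum(power)
    # Assumption: band probability = share of the total spectral power.
    probs = [p / total for p in power if p > 0] if total > 0 else []
    entropy = -sum(p * math.log2(p) for p in probs)
    return [peak, third_moment, rms_dev, dc, entropy]

# Hypothetical four-sample window of CSI amplitudes.
feats = window_features([1.0, 2.0, 3.0, 2.0])
```

A symmetric window such as this one has a third central moment of zero; a skewed window, for example one containing a sudden fall, would not.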

Activity Classification
We adopt sparse representation classification (SRC) to discriminate the proposed human activities. SRC has been adopted in various signal and image processing tasks, such as image classification and face recognition [24], and has recently been used in device-free activity recognition [25]. Wei et al. [25] relied on the fact that SRC can operate without features, since the SRC training set can be built directly from CSI measurements. However, this may lead to false detections, particularly under radio interference or environmental changes. Unlike that work, we build our classifier from feature vectors extracted from the training data.
As shown in Figure 1, the proposed system first preprocesses the training samples and then computes the features described above over a sliding window with a length of 5 s. Each sliding window contains one piece of activity information. The features extracted from the CSI streams within each window form a feature vector. These feature vectors are then concatenated to build the overcomplete dictionary. During activity classification, the unknown CSI samples are preprocessed in the same way as the training samples and modeled as feature vectors. Finally, SRC classifies the unknown activity based on the overcomplete dictionary constructed during training.
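A minimal sketch of the windowing and dictionary construction follows. The 5 s window comes from the text and the 100 Hz rate from the experiment setup; the 1 s step between windows, the two-element `toy_features` function, and the synthetic recording are assumptions made only for illustration.

```python
def sliding_windows(series, rate_hz=100, window_s=5, step_s=1):
    """Cut a sample series into fixed-length windows (step_s is an assumed value)."""
    size = rate_hz * window_s
    step = rate_hz * step_s
    return [series[start:start + size]
            for start in range(0, len(series) - size + 1, step)]

def build_dictionary(labelled_series, feature_fn):
    """Collect one feature vector per window as a dictionary column with its label."""
    columns, labels = [], []
    for label, series in labelled_series:
        for window in sliding_windows(series):
            columns.append(feature_fn(window))
            labels.append(label)
    return columns, labels

def toy_features(window):
    """Placeholder for the real feature set: maximum and mean of the window."""
    return [max(window), sum(window) / len(window)]

recording = [float(i % 50) for i in range(1000)]     # 10 s of synthetic samples
columns, labels = build_dictionary([("walk", recording)], toy_features)
```

With these assumed settings, a 10 s recording yields six overlapping windows of 500 samples, and each window contributes one labelled column to the dictionary.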
We model the dictionary as a concatenation of the features extracted and computed for each activity class during training. Suppose that we have $c$ activity classes and $n_i$ training samples from class $i$, where $i \in \{1, 2, \ldots, c\}$. The $n_i$ training samples are arranged as the columns of the data matrix $M_i = [x_{i,1}, x_{i,2}, \ldots, x_{i,n_i}] \in \mathbb{R}^{d \times n_i}$, where $d$ is the dimension of the feature vector. Any new test sample $y \in \mathbb{R}^d$ from the same activity class can be represented as a linear combination of the training samples in $M_i$:

$$y = a_{i,1} x_{i,1} + a_{i,2} x_{i,2} + \cdots + a_{i,n_i} x_{i,n_i}, \quad (13)$$

where the coefficients $a_{i,j} \in \mathbb{R}$, $j = 1, 2, \ldots, n_i$. The concatenation of the training samples of all activity classes defines the matrix $A$:

$$A = [M_1, M_2, \ldots, M_c]. \quad (14)$$

The test sample $y$ can then be written as:

$$y = A a, \quad (15)$$

where $a$ is the sparse coefficient vector, whose entries are zero except for those associated with class $i$, as explained in Equation (16):

$$a = [0, \ldots, 0, a_{i,1}, a_{i,2}, \ldots, a_{i,n_i}, 0, \ldots, 0]^T. \quad (16)$$

Thereafter, we solve the $\ell_1$ minimization problem:

$$\hat{a} = \arg\min_{a} \|a\|_1 \quad \text{subject to} \quad y = A a. \quad (17)$$

To discriminate the activity class, we compare how well the coefficients of $\hat{a}$ associated with each class reproduce $y$:

$$r_i(y) = \|y - A\, \delta_i(\hat{a})\|_2, \quad (18)$$

where $r_i$ is the residual of class $i$, and $\delta_i(\hat{a})$ keeps only the entries of $\hat{a}$ associated with class $i$. Then, the activity class is determined as:

$$\mathrm{class}(y) = \arg\min_{i} r_i(y). \quad (19)$$
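The classification rule can be illustrated with a small, self-contained sketch. It is not the authors' solver: the sparse coefficients are obtained with the iterative soft-thresholding algorithm (ISTA) applied to a relaxed, penalized form of the ℓ1 problem, and the penalty `lam`, the iteration count, the tiny dictionary, and the test vector are all hypothetical.

```python
import math

def matvec(A, x):
    """Multiply matrix A (list of rows) by vector x."""
    return [sum(a * b for a, b in zip(row, x)) for row in A]

def transpose(A):
    return [list(col) for col in zip(*A)]

def soft(v, t):
    """Soft-thresholding operator used by ISTA."""
    return math.copysign(max(abs(v) - t, 0.0), v)

def l1_solve(A, y, lam=0.01, iters=300):
    """ISTA for min 0.5 * ||A a - y||_2^2 + lam * ||a||_1 (relaxed l1 problem)."""
    At = transpose(A)
    # 1 / ||A||_F^2 is a safe step size because ||A||_F^2 bounds the Lipschitz constant.
    step = 1.0 / sum(v * v for row in A for v in row)
    a = [0.0] * len(At)
    for _ in range(iters):
        resid = [yi - pi for yi, pi in zip(y, matvec(A, a))]
        grad = matvec(At, resid)
        a = [soft(ai + step * gi, step * lam) for ai, gi in zip(a, grad)]
    return a

def src_classify(A, labels, y, lam=0.01):
    """Pick the class whose own coefficients reconstruct y with the smallest residual."""
    a = l1_solve(A, y, lam)
    residuals = {}
    for c in sorted(set(labels)):
        a_c = [ai if lab == c else 0.0 for ai, lab in zip(a, labels)]
        recon = matvec(A, a_c)
        residuals[c] = math.sqrt(sum((yi - ri) ** 2 for yi, ri in zip(y, recon)))
    return min(residuals, key=residuals.get), residuals

# Hypothetical dictionary: 3-dimensional features, two training columns per class.
A = [[1.0, 0.9, 0.0, 0.0],
     [0.0, 0.1, 1.0, 0.9],
     [0.0, 0.0, 0.0, 0.1]]
labels = ["walk", "walk", "sit", "sit"]
predicted, residuals = src_classify(A, labels, [0.95, 0.05, 0.0])
```

The test vector lies close to the span of the "walk" columns, so its residual for that class is small while the "sit" columns leave most of the vector unexplained.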

Experiments Setup
We use a TP-LINK TL-WR845N router as the access point (AP) and a Lenovo ThinkPad X201 with an IWL 5300 NIC (Intel Corporation, Santa Clara, CA, USA) and CSI-Tools [21] installed as the detection point (DP). The transmission rate is set to 100 Hz. We conducted extensive experiments with three volunteers in an apartment with two bedrooms and a living room, in LOS and NLOS scenarios.
In the LOS scenario, the access point (AP), the detection point (DP), and the target user are placed in the living room. The AP and DP are fixed on two desks at a height of 1 m, separated by a distance of 4 m. The target user performed the proposed activities in a fixed position between the AP and the DP. In the NLOS scenario, the AP is fixed in the living room and the DP is fixed in the bedroom, separated by a six-inch hollow wall and a distance of 4 m. The target user performed the proposed activities in a fixed position in the bedroom in front of the DP. Figure 5 shows the floor plan of the experiment environment.
We collect activity samples from each user individually. Table 1 shows the details of the samples collected from three volunteer users during the experiments. The volunteers are asked to perform the proposed activities in an apartment that contains different electronic appliances and is full of furniture. We collect samples in two different scenarios, as described in Table 1. In each scenario, we collect 450 samples of each activity from three users during nine sessions. In each session, 50 samples of each activity are collected from each user. Thus, in each session, a total of 150 samples of each activity are collected from three volunteer users. We train our classifier with 100 samples of each activity, and we test the accuracy on 50 samples. The system preprocesses the CSI collected for each activity and computes the selected features over a sliding window with a length of 5 s.

Results
We use the confusion matrix to evaluate our system. The accuracy is computed as:

$$\mathrm{Accuracy} = \frac{TP + TN}{TP + TN + FP + FN}, \quad (20)$$

where TP, TN, FP, and FN are the numbers of true positives, true negatives, false positives, and false negatives, respectively. The users are asked to perform the six activities listed in Table 1 individually, while the last class (Empty) is the measurement of the test environment with no moving entities. Figure 6a shows the confusion matrix for the activity classification in the LOS scenario. From the confusion matrix, we see that the average accuracy in the LOS scenario is about 95.85%, and each activity is classified with a high TP rate. Moreover, our system achieves a high accuracy of 92.57% in the NLOS scenario, as shown in Figure 6b. Researchers have used numerous classification algorithms to build activity classification models; support vector machine (SVM) and k-nearest neighbor (kNN) are among the most popular. Thus, we compare our proposed SRC with SVM and kNN in both LOS and NLOS scenarios. Experimentally, we find that the SRC classification accuracy outperforms both SVM and kNN. Figure 7 shows the averaged accuracy of the three methods.
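As a worked illustration of the accuracy computation, the sketch below derives TP, FN, FP, and TN for one class from a multi-class confusion matrix (rows are true classes, columns are predicted classes). The 2 × 2 matrix is hypothetical and does not reproduce the results of this paper.

```python
def per_class_accuracy(cm, k):
    """Accuracy of class k: (TP + TN) / (TP + TN + FP + FN)."""
    total = sum(sum(row) for row in cm)
    tp = cm[k][k]
    fn = sum(cm[k]) - tp                     # class k samples predicted as another class
    fp = sum(row[k] for row in cm) - tp      # other samples predicted as class k
    tn = total - tp - fn - fp
    return (tp + tn) / (tp + tn + fp + fn)

def overall_accuracy(cm):
    """Share of all samples that lie on the diagonal of the confusion matrix."""
    total = sum(sum(row) for row in cm)
    return sum(cm[i][i] for i in range(len(cm))) / total

# Hypothetical confusion matrix for two classes.
cm = [[8, 2],
      [1, 9]]
acc_class0 = per_class_accuracy(cm, 0)
acc_all = overall_accuracy(cm)
```

For this matrix, class 0 has TP = 8, FN = 2, FP = 1, and TN = 9, so its accuracy is 17/20 = 0.85.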

Discussion
As shown in the Results section, the system achieves high accuracy in both LOS and NLOS scenarios. We also tested single-user activity recognition with fewer training samples: we trained the classifier for each individual user with 50 samples of each activity and tested it on 100 samples of each activity. Figure 8 shows that the system remains stable and achieves high accuracy. Thus, the proposed method can improve human activity classification based only on wireless signals, without special devices. Moreover, it is important to demonstrate the system accuracy when the testing samples do not overlap with the training samples. We trained the classifier with 100 samples of each activity collected from User 1 and User 2 and tested it on 50 samples of each activity collected from User 3 during two sessions. In addition, we decreased the training samples collected from User 1 and User 2 to 50 samples of each activity and tested on 100 samples collected from User 3. As shown in Figure 9, the accuracy decreased slightly. Comparing Figures 8 and 9, the classifier in Figure 8 achieves higher accuracy, even with only 50 training samples, because its training set includes samples from the target user, whereas the classifier in Figure 9 was not trained with samples collected from the target user. Overall, the results show that the system maintains a high accuracy rate under different conditions.
Furthermore, to validate our feature extraction method, we compare our feature-based classification methodology (SRC with feature extraction) with the featureless SRC methodology. Wei et al. [25] proposed a device-free human activity recognition system that leverages the fluctuation of CSI and adopted SRC to classify several activities. Unlike our system, Wei et al. used SRC without feature extraction: they relied on the ability of SRC to operate without features and built training sets directly from CSI measurements. However, this methodology is not robust to environmental changes or electromagnetic noise, and it leads to false detections. The result in Figure 10 shows that our system outperforms the featureless SRC method. Thus, feature extraction improves the classification accuracy and the whole sensing process. Furthermore, it should be noted that, in each evaluated session, the proposed activities must be performed in a fixed position to be detected correctly. Thus, CSI-based activity classification techniques can be considered location-dependent because the CSI value differs from one position to another.

Conclusions
In this paper, we present a device-free indoor activity recognition system that leverages the fluctuation of the channel state information (CSI) of WiFi signals. We present an effective method to obtain the most useful information from the raw CSI. We design an efficient pattern segmentation method to obtain the real segments of activities and choose useful features from both the time and frequency domains. Moreover, we present a fast classification algorithm that recognizes several activities with high accuracy in different scenarios. Experimental results show that our SRC algorithm outperforms SVM and kNN in classification accuracy and that feature-based SRC outperforms featureless SRC.
Device-free sensing technology is very immature and needs deeper investigation because the interference of moving objects leads to incorrect detection or false detection.Moreover, to detect and recognize different activities implemented simultaneously by multiple users is still a prominent challenge in current approaches.Thus, we plan to focus on such challenges in future work in order to build a more stable and reliable system that can be used in more complex scenarios.

Figure 5. Floor plan of the experiment environment.

Figure 6. Confusion matrices for the activity classification. (a) confusion matrix for Line-of-Sight (LOS); and (b) confusion matrix for Non-Line-of-Sight (NLOS).

Figure 8. Classification accuracy results with fewer training samples. (a) the results in LOS scenario; and (b) the results in NLOS scenario.

Figure 9. Classification accuracy results with 100 and 50 training samples.

Figure 10. SRC classification accuracy with feature extraction and with featureless property.