High-Resolution Indoor Sensing Using Channel State Information of WiFi Networks

Abstract: Indoor sensing is becoming increasingly important as it can be effectively utilized in many applications, from digital health care systems to indoor safety and security systems. In particular, implementing sensing operations using existing infrastructure improves our experience and well-being, and exhibits unique advantages. The physical layer channel state information (CSI) for wireless fidelity (WiFi) communications carries rich information about scatterers in the propagation environment; hence, we exploited this information to enable detailed recognition of human behaviours in this study. Comprehensive calibration and filtering techniques were developed to alleviate the redundant responses embedded in the CSI data due to static objects and accidental events. Accurate information on breathing rate, heartbeat and angle of arrival of the incoming signal at the receiver side was inferred from the available CSI data. The method and procedure developed can be extended for sensing or imaging the environment utilizing wireless communication networks.


Introduction
Combining wireless connectivity and robust sensing capability is expected to play a vital part in future wireless networks, e.g., beyond 5G (B5G) and 6G. It is envisaged that sensing is going to play a significant role in many future developments [1]. Conventionally, sensing is realized by using dedicated hardware and antennas. However, it is increasingly necessary to share the same hardware with communication systems due to congestion in the frequency spectrum [2]. Moreover, jointly designing sensing and communication systems or utilizing the same signal for both functions can substantially improve the energy efficiency of the system and reduce the amount of hardware utilized [3].
Over the past decade, health-related monitoring techniques of human bodies have undergone significant development. The Vital-Radio monitoring method was proposed by F. Adib et al. [4], in which respiration and heart rate information is acquired utilising a signal with a frequency between 5.46 GHz and 7.25 GHz. Specific hardware is required, resulting in high costs. The WiKiSpiro system proposed by P. Nguyen et al. can monitor respiration by fusing the sensing data obtained by a depth camera and a radar system [5]. This system utilizes an artificial neural network (ANN) to establish the relationship between breathing and the movement of the chest and abdomen of the human body. It also analyzes whether the radar is accurately irradiating the chest and abdomen of the human body. However, the equipment used in the system is expensive, and the depth camera used is sensitive to light, which may lead to concerns about privacy protection. WiSpiro, proposed by X. Zhang et al., is a system based on frequency modulated continuous wave (FMCW) radar [6]. The system analyzes phase changes caused by minor motions of the chest and abdomen on the continuous wave signal transmitted by the 2.4 GHz directional antenna via a phase demodulation procedure. Afterwards, it reconstructs the motion effect and links it to the breathing process by training the ANN. However, this system requires customized hardware and, due to limitations of the mechanical motion control system, it takes a long time to scan the subject. If the subject moves quickly during the scanning process, the monitoring accuracy will be degraded. Recent developments include utilizing millimetre wave radars [7]. However, they have restricted applications that usually involve the detection and tracking of large moving objects at far distances, such as in traffic control or in consumer applications for interaction detection. The sensors must be fitted into the environment or their application carriers separately. They are costly and not energy efficient for general sensing purposes. Hence, sensing without using specific devices in ubiquitous wireless environments has attracted significant interest. In contrast to the above-mentioned radar systems, radio frequency identification (RFID) tags are used for sensing. They are very small and have a high recognition rate while requiring only rudimentary reading equipment, making them a viable option for vital sign monitoring [8][9][10]. However, the security of RFID technology is suboptimal and its frequency range is not standardized across countries, resulting in limited portability.
In addition to RFID tags for sensing, other contact-based detection systems have also been developed, such as wearable devices, including wearable microphones. They gather data that can be directly or indirectly utilized to infer sleep quality. Subsequently, contactless sensing emerged. WiFi signals appear to be a more advantageous alternative as they are widely available without requiring additional hardware [11][12][13][14][15][16]. In 2013, Mostafa Seifeldin et al. [17] proposed the first received signal strength indicator (RSSI)-based action recognition system, Nuzzer, which can detect whether people are moving but without providing specific details. In 2014, Neal Patwari explored the use of received signal strength (RSS) measurements between commercial wireless devices to infer the location of a person by exploiting information about the breathing response and breathing rate, estimating these for various situations [18]. This approach has the advantage of being able to locate and monitor a person by his/her breathing response without the need for calibration. This is particularly important for applications in search and rescue, healthcare, and security. One challenge for this method is motion interference, where movements other than breathing (i.e., inhalation and exhalation) can lead to significant changes in RSS. Qifan Pu et al. proposed WiSee [19], which utilizes the RSSI change in WiFi signals caused by human motion in the environment. It has the capacity to recognize nine different gestures with an average accuracy rate of 94%, even through walls. More advanced features of human activities can be assessed based on the RSSI of WiFi signals [20,21]. Since RSSI information is coarse-grained, it is significantly affected by the complexity of the environment, including multipath propagation. Hence, the detection reliability is low, and minor human activities, such as breathing and heartbeat, are challenging to detect with high accuracy.
Detection of moving humans using WiFi CSI signals was reported in [22], which presented a novel scheme for device-free passive detection of moving humans with dynamic speed. However, the approach relies on a high level of accuracy of the CSI information, which cannot always be guaranteed for general WiFi devices. Mohammed Ibrahim implemented WiFi-based sensing within a vehicle to monitor the respiration rate of passengers in real-time [23]. The CSI data for a subcarrier is randomly selected and smoothed using a Savitzky-Golay filter. Ultimately, the respiration rate is extracted via a peak detection methodology. However, the presence of outliers in the collected data was not accounted for, potentially compromising the accuracy of the determined respiration rate. Utilizing off-the-shelf equipment and a real-time processing system, Yu Gu successfully developed a prototype system capable of capturing the breathing patterns of individuals during sleep [24]. Once the CSI raw data are obtained, the system identifies the best subcarrier by observing the maximum variance of each free line, and subsequently employs a Hampel filtering technique to effectively denoise the signal.
In a recent development in the sensing of human vital signs [25], researchers designed two processing modules for extracting vital signs based on WiFi CSI signals. The first module is a noise reduction module that employs principal component analysis (PCA) decomposition and low-pass filtering. The other module utilizes a power threshold for peak detection in conjunction with pass filtering. However, noise introduced by the hardware equipment was not adequately accounted for in the design of the noise reduction module. As a result, the resolution for sensing is limited. To assess the benefits of the proposed method, we performed a comprehensive analysis of recent studies in the field [26][27][28][29] and compared the key aspects of different approaches.
In this study, we used a large amount of data from indoor environments to train and verify the model using the measured WiFi CSI database publicly available from [30]. The effectiveness of the proposed method was found to be equivalent to that of some micro-Doppler radars but without extra devices being involved. The results were analyzed in depth and were comparable to results obtained from real-world settings. The approach and procedure developed can be applied in general wireless networks to provide communication and radar functionalities simultaneously.

Data Processing and Method
The CSI data can be acquired from the physical layer information stored during WiFi communication. However, these raw data include responses from unexpected incidents, and noise from the system and environment. Hence, they require a level of processing before sensing decisions are made. Figure 1 presents the procedure for indoor sensing based on CSI information, which consists of three key processes: (i) acquisition of CSI data, (ii) data processing, and (iii) sensing of the indoor environment and angle of arrival (AoA) estimation. The obtained CSI data consist of amplitude and phase information at each frequency point of interest. The data processing module separates static factors from the environment, performs phase sanitization, eliminates outliers and performs data denoising. In the indoor sensing and AoA estimation processes, an instantaneous phase-based detection method is employed to infer the respiration rate from the processed CSI data.

Data Acquisition
The dataset we used comprised the results for five experiments performed by 30 different subjects in three different indoor environments. The experiments performed in the first two environments were of a line-of-sight (LOS) nature, while the experiments performed in the third environment were of a non-line-of-sight (NLOS) nature. The collected raw signals were stored in one main directory that contained three subdirectories. These subdirectories comprised the data that were recorded in the aforementioned three different environments. In each of these subdirectories, the data acquired for 10 different subjects were available. Each subject was involved in five experiment classes (falling from a sitting position, falling from a standing position, walking, sitting down and standing up, picking a pen up from the ground), consisting of 12 activities (sit still on a chair, fall down, lie down, ...) across these five experiment classes; each activity was repeated 20 times. The data we used were trials 1 to 4 of the first subject's first experimental session (sit still on a chair) in environment 1. As shown in Table 1, we used the following files: E1_S01_C01_A01_T01, E1_S01_C01_A01_T02, E1_S01_C01_A01_T03 and E1_S01_C01_A01_T04. The CSI data were obtained from a network interface card (NIC) with parameter control. A group of channel frequency responses (CFRs) of 30 subcarriers (N = 30) was acquired in every packet. The CFRs were organized as
$$\mathbf{H} = [H(f_1), H(f_2), \ldots, H(f_N)].$$
Each CSI represents the amplitude and phase of an orthogonal frequency division multiplexing (OFDM) subcarrier:
$$H(f_k) = A_k e^{j\Theta_k},$$
where $H(f_k)$ represents the frequency response of the k-th subcarrier at the frequency $f_k$, $A_k$ indicates the amplitude of the CSI, and $\Theta_k$ denotes the phase of the CSI. To resolve the dynamic events from the environment, CSIs were acquired at four separate moments (at each moment, approximately 3 s of CSI data was recorded), under the same situation where one person was present inside the room for measurement. At each moment, approximately 1000 packets of CSI data were acquired and can be expressed as
$$\mathbf{H}_{\mathrm{CSI}} = [\mathbf{H}_1, \mathbf{H}_2, \ldots, \mathbf{H}_K],$$
where K is the total number of measurements for CFR at one moment. The acquired CFR data at four separate moments then serve as the basis for the sensing purpose. Before using the data to implement the algorithm and making any decision on detection, phase sanitization and outlier filtering processes are performed. Linear phase calibration (LPC) is a sanitization technique used to eliminate phase offsets in CSI caused by hardware limitations and multipath effects. Phase information is a critical component of CSI, as it contains vital information about the wireless channel, including the signal direction, the path location and the number of reflections, scattering path information, and channel attenuation. Inaccurate phase information can lead to incorrect channel analysis and interpretation, making it crucial to remove phase offsets to ensure accurate analysis and utilization of CSI data. LPC achieves this by adjusting the phase of each subcarrier to ensure that the CSI reflects the true state of the wireless channel. This calibration technique is essential for improving the accuracy and reliability of wireless communication systems that rely on CSI.
In experimental tests, the measured CSI phases are wrapped, i.e., they lie in the range [−π, π] and the true phase is an integer multiple of 2π away. This creates difficulties when using the phase information. Therefore, the first step in linearizing the phase is to unwrap the measured phase, which involves removing or adding integer multiples of 2π to the wrapped phase. The unwrapped phase $\theta_{n,k+1}$ is calculated through the following process:
$$\theta_{n,k+1} = \theta_{n,k} + \Delta\Phi_{n,k} - 2\pi\,\mathrm{round}\!\left(\frac{\Delta\Phi_{n,k}}{2\pi}\right), \qquad \theta_{n,1} = \Phi_{n,1},$$
where $\theta_{n,k+1}$ represents the phase of the subcarrier with serial number k + 1 obtained by unwrapping at time n, and $\Phi_{n,k+1}$ represents the phase response measured for serial number k + 1 at time n. $\Delta\Phi_{n,k}$ is defined according to the following formula:
$$\Delta\Phi_{n,k} = \Phi_{n,k+1} - \Phi_{n,k}.$$
For CSI signals, the phase offsets caused by the carrier frequency offset (CFO) and the sampling frequency offset (SFO) during signal transmission are random due to different antennas on the same receiver sharing the same oscillator. These offsets can have varying effects at different subcarrier frequencies. The CFO arises from the difference in the oscillator frequencies or multipath propagation between the base station and the user, leading to a frequency offset between the transmitted and received signals. The SFO is produced from the clock difference between the base station and the user, resulting in a timing offset between the transmitted and received signals. After performing fast Fourier transform (FFT) on the received signal, the CSI signal can be obtained, and the phase difference of each subcarrier can be calculated in the frequency domain to obtain the estimated value of the CFO. The SFO can be estimated by comparing the sign difference between the received signal and the transmitted signal. To use the CSI for high-accuracy sensing, it is necessary to estimate and make corrections to the CFO and SFO. Failure to do so can cause signal distortion and affect signal decoding and analysis. Hence, the measured CSI phase can be expressed as:
$$\theta_{n,k} = \varphi_{n,k} - 2\pi\frac{I_k}{M}\xi_n + \beta_n,$$
where $\theta_{n,k}$ represents the estimated CSI phase of the k-th subcarrier at time n, $\varphi_{n,k}$ is the actual phase response, $I_k$ is the subcarrier sequence index (ranging from −28 to 28 in IEEE 802.11n), M is the size of the FFT, $\xi_n$ is the time delay caused by the SFO, and $\beta_n$ is the unknown phase offset of the CFO. As $I_k$ is symmetric, in order to eliminate $\xi_n$ and $\beta_n$, the key is to consider the phase of the entire bandwidth. We seek a and b to satisfy the following condition:
$$(a, b) = \arg\min_{a,b} \sum_{k=1}^{N} \left(\theta_{n,k} - a I_k - b\right)^2.$$
Hence, using $\sum_k I_k = 0$, a and b can be derived as
$$a = \frac{\sum_{k=1}^{N} I_k\,\theta_{n,k}}{\sum_{k=1}^{N} I_k^2}, \qquad b = \frac{1}{N}\sum_{k=1}^{N} \theta_{n,k}.$$
The actual phase responses for all subcarriers, $\varphi_n$, can then be obtained by removing the phase deviations caused by the combined effect from random noise during propagation, i.e., $\varphi_{n,k} = \theta_{n,k} - a I_k - b$.
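The two calibration steps described above (unwrapping, then removing a linear trend across subcarriers) can be sketched in a few lines. This is a minimal illustration, not the authors' implementation: it assumes a least-squares line fit for a and b, and the function name `sanitize_phase` and the evenly spaced index vector in the usage note are hypothetical.

```python
import numpy as np

def sanitize_phase(wrapped_phase, subcarrier_index):
    """Unwrap one packet's CSI phase and remove its linear trend.

    wrapped_phase    : measured phases in [-pi, pi], one per subcarrier
    subcarrier_index : subcarrier indices I_k (same length)
    """
    idx = np.asarray(subcarrier_index, dtype=float)
    # Step 1: add or remove multiples of 2*pi between adjacent subcarriers.
    phase = np.unwrap(np.asarray(wrapped_phase, dtype=float))
    # Step 2: least-squares line fit (assumption). The slope a absorbs the
    # SFO-induced delay term, the intercept b absorbs the CFO offset.
    a, b = np.polyfit(idx, phase, 1)
    return phase - (a * idx + b)
```

For example, `sanitize_phase(np.angle(csi_row), np.linspace(-28, 28, 30))` would calibrate one packet of complex CSI values `csi_row`. Because `np.polyfit` solves the general least-squares problem, the sketch does not rely on the index set being exactly symmetric.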

Removing CSI Static Component
In practice, the CSI information is often subject to environmental factors that introduce static components in the channel responses; they are assumed to stay constant over the observation period. In contrast, the dynamic components of CSI change more rapidly. However, when the signal propagates in a complex environment with various clutters, the dynamic components tend to become less significant. To extract the dynamic components from CSI that are closely linked to human behavior in the environment for study, it is essential to eliminate the static components in CSI.
One effective method is to use the robust principal component analysis (RPCA) algorithm. It decomposes the raw CSI responses into a low-rank matrix representation, aiming to remove the influence of static components and to obtain dynamic CSI components that are closely related to human behaviors. This approach enables more accurate behavior recognition and tracking.
Assume we have a matrix Y that can be expressed as the sum of a low-rank matrix L and a sparse matrix S, i.e., Y = L + S. Our objective is to obtain L and S. By leveraging the inherent nature of the original low-rank matrix L, we can recover the matrix through an optimization process. Furthermore, we assume that only a small fraction of the elements are corrupted, i.e., the noise is sparse but of arbitrary size. To address this issue, we can utilize the Lagrange multiplier method to convert it into an unconstrained optimization problem and find the optimal solution using an appropriate optimization algorithm. Conventionally, the matrix decomposition problem involves optimizing an objective function that includes the non-convex rank of the low-rank matrix L and the 0-norm of the noise matrix S, weighted by λ. As a result, this problem becomes non-deterministic polynomial-time hard (NP-hard) and requires relaxation for optimization. To overcome this challenge, one possible approach is to use alternative functions, such as the nuclear norm of the matrix to approximate the rank of the matrix, and the 1-norm of the matrix to approximate the 0-norm. This transformation converts the NP-hard problem into a convex optimization problem, leading to efficient computation yielding the optimal solution:
$$\min_{L,S} \; \|L\|_* + \lambda\|S\|_1 \quad \text{s.t.} \quad Y = L + S.$$
In this paper, we employ the augmented Lagrangian method (ALM) to solve the convex optimization problem. Specifically, we first construct the Lagrangian function:
$$\mathcal{L}(L, S, G) = \|L\|_* + \lambda\|S\|_1 + \langle G, Y - L - S\rangle,$$
where G is the Lagrange multiplier and $\langle G, Y - L - S\rangle$ is the inner product of G and Y − L − S.
The optimization problem is turned into an unconstrained problem:
$$\mathcal{L}_\mu(L, S, G) = \|L\|_* + \lambda\|S\|_1 + \langle G, Y - L - S\rangle + \frac{\mu}{2}\|Y - L - S\|_F^2,$$
where µ is a positive scalar and λ is the non-negative weight of the sparse term. The sparse matrix S can be obtained by
$$S = \arg\min_{S} \; \lambda\|S\|_1 + \frac{\mu}{2}\left\|Y - L - S + \frac{G}{\mu}\right\|_F^2.$$
The low-rank matrix L is then obtained from the simplified function
$$L = \arg\min_{L} \; \|L\|_* + \frac{\mu}{2}\left\|Y - L - S + \frac{G}{\mu}\right\|_F^2.$$
To obtain estimates for the low-rank matrix L and the sparse matrix S, we repeat the optimization process described above until convergence is achieved. This iterative process involves updating L and S using the alternating direction method of multipliers (ADMM) until the discrepancy between the current and the previous estimates of L and S is sufficiently small. The specific steps of the RPCA algorithm for matrix decomposition are shown in Figure 2. Once the two processes converge, we obtain the estimates for the matrix L and matrix S. They represent the dynamic and static CSI components, respectively.
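The alternating L and S updates described above have closed-form solutions: singular value thresholding for the nuclear-norm step and element-wise soft thresholding for the 1-norm step. The sketch below is a generic textbook ADMM loop under assumed defaults (λ = 1/√max(m, n), a fixed µ derived from the data, and a residual-based stopping rule, none of which are stated in the text); the function names are hypothetical and this is not necessarily the exact procedure of Figure 2.

```python
import numpy as np

def soft_threshold(x, tau):
    """Element-wise shrinkage: proximal operator of the 1-norm."""
    return np.sign(x) * np.maximum(np.abs(x) - tau, 0.0)

def svd_threshold(x, tau):
    """Singular value shrinkage: proximal operator of the nuclear norm."""
    u, s, vt = np.linalg.svd(x, full_matrices=False)
    return (u * soft_threshold(s, tau)) @ vt

def rpca_alm(Y, lam=None, mu=None, tol=1e-7, max_iter=1000):
    """Split Y into low-rank L and sparse S with Y ~= L + S."""
    Y = np.asarray(Y, dtype=float)
    m, n = Y.shape
    lam = 1.0 / np.sqrt(max(m, n)) if lam is None else lam    # assumed default
    mu = 0.25 * m * n / np.abs(Y).sum() if mu is None else mu  # assumed default
    L = np.zeros_like(Y)
    S = np.zeros_like(Y)
    G = np.zeros_like(Y)            # Lagrange multiplier
    norm_Y = np.linalg.norm(Y)
    for _ in range(max_iter):
        L = svd_threshold(Y - S + G / mu, 1.0 / mu)    # low-rank update
        S = soft_threshold(Y - L + G / mu, lam / mu)   # sparse update
        residual = Y - L - S
        G = G + mu * residual                          # multiplier update
        if np.linalg.norm(residual) <= tol * norm_Y:
            break
    return L, S
```

In a CSI setting, `Y` could be the packets-by-subcarriers amplitude matrix of one antenna; which of the two returned matrices is carried forward depends on the interpretation adopted in the text.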

Eliminating Outliers
To improve the accuracy of motion detection from CSI measurements, it is essential to investigate outliers caused by protocol specifications and environmental noise. Outliers can have a significant impact on the accuracy of motion detection. The first step towards accurate motion detection is the identification and elimination of outliers. We employed the Hampel identifier, which distinguishes any values outside the specified interval [µ − γσ, µ + γσ] as outliers. Here, µ and σ represent the median and median absolute deviation (MAD) of the data series, respectively, and γ is a parameter dependent on the data distribution, typically set to 3. The removal of outliers results in more reliable outcomes from the measured data. Therefore, it is essential to perform outlier detection after the sampling and processing of the collected dataset. The outlier detection process involves calculating the median and MAD, determining the closed interval based on the target parameters, and identifying and eliminating outliers. After implementing this process, the obtained data are subject to further analysis and processing, enhancing the accuracy and reliability of motion detection.
Let O denote the set of outliers in the dataset. Each amplitude of the channel response, denoted as $x_i$, is determined to be an outlier if it satisfies the following condition:
$$|x_i - \mu| > \gamma\sigma.$$
The set O contains all the values that fall outside the interval [µ − γσ, µ + γσ], indicating the presence of outliers in the dataset.
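The outlier rule above translates directly into code. The following sketch uses the median and the unscaled MAD of the whole series, as in the text; replacing flagged samples with the median is an assumption made here for illustration (the text only says that outliers are eliminated), and both function names are hypothetical.

```python
import numpy as np

def hampel_outliers(x, gamma=3.0):
    """Boolean mask of samples outside [mu - gamma*sigma, mu + gamma*sigma]."""
    x = np.asarray(x, dtype=float)
    mu = np.median(x)                    # median of the series
    sigma = np.median(np.abs(x - mu))    # median absolute deviation (MAD)
    return np.abs(x - mu) > gamma * sigma

def remove_outliers(x, gamma=3.0):
    """Replace flagged samples with the series median (assumed policy)."""
    x = np.asarray(x, dtype=float)
    cleaned = x.copy()
    cleaned[hampel_outliers(x, gamma)] = np.median(x)
    return cleaned
```

Applied to one subcarrier's amplitude series, `remove_outliers(amplitude)` leaves in-range samples untouched and replaces only the flagged ones. A sliding-window variant, which recomputes the median and MAD locally, is a common alternative when the signal level drifts.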

Kalman-Filter-Based Noise Reduction
In practice, a Kalman filter is commonly employed to process noisy sensor data. By analyzing the combined responses, it can effectively remove noise from the sensor data and enhance the accuracy of the estimates. In applications for signal processing, a Kalman filter is widely utilized for tasks such as filtering, aiming for better prediction results. In this study, a Kalman filter was used to filter the noisy CSI data. The processed data were then used to derive respiratory-related outcomes. The signal-to-noise ratio increased due to this step; hence, a higher accuracy of prediction was achieved.
The Kalman filter incorporates prior knowledge of the system and iteratively renews the state and covariance of the target, providing an accurate estimate of the signal's state. To implement the Kalman filter, one can use the Kalman filter toolbox, which requires the input of several parameters, such as the state and observation equations and the state noise and observation noise covariance matrices, among others. With the input observations of each frame, the algorithm can generate the corresponding state estimation values. Due to its ability to handle noisy signals and accurately estimate the state of a dynamic system, the Kalman filter has become widely used in signal processing. In particular, the response due to respiratory movements is buried in the CSI data, and it is needed to yield a high signal-to-noise ratio. The Kalman filter consists of two steps: prediction and update. In the prediction step, the current state of the system is obtained from the previous moment through the state transition matrix A. The state covariance matrix $P_t$ is also obtained through the state covariance matrix at the previous moment, $P_{t-1}$. The state transition matrix A and noise covariance matrix Q are then calculated. The predicted state $X'_t$ and the predicted state covariance matrix $P'_t$ can be written as
$$X'_t = A X_{t-1}, \qquad P'_t = A P_{t-1} A^{T} + Q,$$
where Q is the noise covariance. In the update step, the gain $K_t$ for the Kalman filter is calculated with the predicted state covariance matrix $P'_t$ and the observation matrix H. The predicted state value $X'_t$ is then updated using the Kalman gain $K_t$ and the observation value $Z_t$, leading to the updated state value $X_t$. The updated state covariance matrix $P_t$ is then calculated using the Kalman gain $K_t$, the observation matrix H, and the predicted state covariance matrix $P'_t$:
$$K_t = P'_t H^{T}\left(H P'_t H^{T} + R\right)^{-1}, \qquad X_t = X'_t + K_t\left(Z_t - H X'_t\right), \qquad P_t = \left(I - K_t H\right) P'_t,$$
where H is the observation matrix, and R is the observation noise covariance. The Kalman filter can track changes in the system over time and lay a foundation for making a highly accurate estimation of the system's state.
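For a single subcarrier amplitude series, the prediction and update steps reduce to scalar arithmetic. The sketch below is a minimal scalar Kalman filter with assumed values for A, H, Q and R (the text does not report the values used) and a hypothetical function name; it illustrates the recursion, not the toolbox configuration.

```python
import numpy as np

def kalman_filter_1d(z, A=1.0, H=1.0, Q=1e-4, R=1e-1, x0=None, p0=1.0):
    """Filter a 1-D observation series z with a scalar Kalman filter."""
    z = np.asarray(z, dtype=float)
    x = z[0] if x0 is None else x0   # initial state estimate (assumption)
    p = p0                           # initial state covariance (assumption)
    out = np.empty_like(z)
    for t, zt in enumerate(z):
        # Prediction step
        x_pred = A * x
        p_pred = A * p * A + Q
        # Update step
        k = p_pred * H / (H * p_pred * H + R)   # Kalman gain
        x = x_pred + k * (zt - H * x_pred)
        p = (1.0 - k * H) * p_pred
        out[t] = x
    return out
```

The ratio Q/R controls the smoothing: a smaller Q relative to R trusts the model more and yields a smoother curve, at the cost of a slower response to genuine changes such as the breathing motion itself.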

Methods
In order to provide an estimate over the indoor environment based on the processed data, three methods were adopted: the minimum entropy method for data selection; the instantaneous phase-based method for motion sensing; and MUSIC to estimate AoAs.

The Minimum Entropy Method
After processing the CSI data, the responses for the 30 subcarriers are analyzed to select the subcarrier waveform showing the best periodicity. In wireless communication systems, the minimum entropy method is a commonly used technique for selecting optimal subcarriers. This method is based on the concept of entropy, which measures the randomness or uncertainty of a signal. By minimizing the entropy of the received signal, the minimum entropy method selects the subcarrier that best matches the periodicity of the heartbeat and respiration signals that need to be measured. The minimum entropy method provides a simple and efficient way to select the best subcarriers in given channels. Each subcarrier is processed independently, and its entropy value is calculated using the FFT to obtain its frequency domain representation. The DC component is then removed, and normalization is performed. Finally, the entropy value of the subcarrier is calculated using the normalized frequency domain representation. This process is carried out using a loop operation, and the obtained entropy value is recorded. The entropy calculation formula used by the minimum entropy method can be expressed as:
$$E = -\sum_{i=1}^{n} p_i \log p_i.$$
The normalized power of the ith subcarrier is denoted by $p_i$, where n is the total number of subcarriers. Comparing the calculated entropy values, the subcarrier with the smallest entropy is selected as the best subcarrier for further processing.
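The selection loop described above can be sketched as follows. This illustration assumes the entropy is computed over each subcarrier's normalized power spectrum (FFT, DC removed, normalized to sum to one) and that the input is a packets-by-subcarriers amplitude matrix; the function names and the natural-logarithm base are choices made here, not taken from the text.

```python
import numpy as np

def spectral_entropy(x):
    """Entropy of the normalized power spectrum of one subcarrier series."""
    x = np.asarray(x, dtype=float)
    power = np.abs(np.fft.rfft(x - x.mean())) ** 2
    power = power[1:]                  # drop the DC bin
    total = power.sum()
    if total == 0.0:                   # constant series: no periodicity at all
        return float("inf")
    p = power / total                  # normalize to a distribution
    p = p[p > 0]                       # avoid log(0)
    return float(-(p * np.log(p)).sum())

def select_subcarrier(csi_amplitude):
    """Return (index of minimum-entropy subcarrier, all entropy values)."""
    csi_amplitude = np.asarray(csi_amplitude, dtype=float)
    entropies = np.array([spectral_entropy(col) for col in csi_amplitude.T])
    return int(np.argmin(entropies)), entropies
```

A strongly periodic series concentrates its power in a few frequency bins and therefore has a low entropy, whereas a noise-dominated series spreads its power across the spectrum and has a high entropy, which is why the minimum is selected.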

Instantaneous Phase-Based Estimation Method
In the previous section, the best subcarrier signal in the time domain was processed to acquire its main frequency components, which incorporated the respiratory and heartbeat effects. Specifically, the respiratory signal was observed to have a frequency range of 0.1-0.5 Hz, which is notably lower than the heartbeat signal's frequency range of 1-1.67 Hz. To ensure accuracy of the estimation, the signal underwent bandpass filtering to remove frequency components outside the respiratory rate range. Following this, the Hilbert transform was applied to obtain an analytic signal, which contains both amplitude and phase information. The estimation of the respiratory signal involved calculating the phase difference between two consecutive time points of the instantaneous phase signal. This approach relies on the Hilbert transform's linear transformation capabilities, which enable the conversion of any real function into a complex function. Specifically, this transformation is represented by the following expression,
$$h(t) = f(t) + j\,\mathcal{H}(f(t)),$$
where the imaginary unit j is introduced, $\mathcal{H}(f(t))$ denotes the Hilbert transform of the input signal, f(t), and h(t) is designated as the analytic form of f(t), which contains the phase and amplitude information of f(t). The computation of the Hilbert transform is rooted in the Fourier transform, which allows for the calculation of the complex signals' phase and amplitude information with high accuracy. Specifically, the formula for the Hilbert transformation is expressed as
$$\mathcal{H}(f(t)) = \mathcal{F}^{-1}\!\left(-j\,\mathrm{sgn}(\omega)\,F(\omega)\right),$$
where sgn(x) denotes the sign function of x, F(ω) denotes the Fourier transform of f(t), and $\mathcal{F}^{-1}$ represents the inverse Fourier transform. By unwrapping and differentiating the phase, the phase difference between adjacent time points can be accurately calculated; by then multiplying the phase difference by $-1/(2\pi f_s)$ and taking the inverse, the respiratory rate can be eventually extracted.
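The chain of bandpass filtering, Hilbert transform, phase unwrapping and differentiation can be sketched with FFTs alone. The code below makes several assumptions that are not in the text: the bandpass is an ideal (brick-wall) FFT mask, the instantaneous frequency is taken as the standard relation f = f_s Δφ / (2π), and the median of that frequency is reported as the rate. All function names are hypothetical.

```python
import numpy as np

def analytic_signal(x):
    """h(t) = f(t) + j*H(f(t)), with H computed as F^-1(-j*sgn(w)*F(w))."""
    x = np.asarray(x, dtype=float)
    spectrum = np.fft.fft(x)
    sign = np.sign(np.fft.fftfreq(x.size))
    hilbert = np.fft.ifft(-1j * sign * spectrum).real
    return x + 1j * hilbert

def bandpass_fft(x, fs, low, high):
    """Ideal bandpass: zero every FFT bin outside [low, high] Hz (assumption)."""
    x = np.asarray(x, dtype=float)
    spectrum = np.fft.rfft(x)
    freqs = np.fft.rfftfreq(x.size, d=1.0 / fs)
    spectrum[(freqs < low) | (freqs > high)] = 0.0
    return np.fft.irfft(spectrum, n=x.size)

def respiration_rate_hz(x, fs, band=(0.1, 0.5)):
    """Estimate the dominant rate (Hz) in `band` from the instantaneous phase."""
    filtered = bandpass_fft(x, fs, band[0], band[1])
    phase = np.unwrap(np.angle(analytic_signal(filtered)))
    inst_freq = np.diff(phase) * fs / (2.0 * np.pi)
    return float(np.median(inst_freq))
```

Multiplying the returned value by 60 gives breaths per minute; passing `band=(1.0, 1.67)` would target the heartbeat range quoted above instead. A practical pipeline would normally use a causal or zero-phase IIR/FIR bandpass rather than an FFT mask, which is used here only to keep the sketch dependency-free.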

Angle of Arrival Estimation
Multiple signal classification (MUSIC) is a widely used algorithm for AoA estimation in signal processing. The main objective of MUSIC is to estimate the angles of arrival of multiple signals from an array of sensors. At the receiver end, the CSI enables estimation of the angle of arrival of an incoming signal, facilitating the localization of the signal source. The implementation of MUSIC involves multiple steps, i.e., data processing, spatial filtering, and signal spectrum estimation.
First, the CSI data are reconfigured into an M × N matrix by reorganizing the original CSI data (where M is the number of receiving antennas, and N is the number of frequency channels observed).At each channel sweeping moment, this is performed by constructing a column vector h of MN × 1 first, and subsequently rearranging it into a matrix C of M × N.
Next, with knowledge of the physical structure of the antenna array and the propagation characteristic, a spatial filter w(θ) is designed to add weights into the received signals, enabling enhancement of the signal power in the desired direction and suppression of interfering signals in other directions. Here, θ represents the direction angle of the signal arrival, and the expression of w(θ) is given by: where a(θ) is the spatial filter vector of the antenna array in the direction θ, which can be expressed as: The spatial filter is applied on the CSI matrix C to yield a weighted data matrix X(θ), where θ represents the direction angle of signal arrival. X(θ) is then expressed as: Finally, for each arrival angle θ, the signal spectrum S(θ) can be estimated by applying singular value decomposition (SVD) on the weighted data matrix X(θ), given as: where σ 1 denotes the largest singular value of X(θ).
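For orientation, the textbook noise-subspace form of MUSIC is sketched below. It is not the spatial-filter and SVD formulation described above, whose expressions are not reproduced here. The sketch assumes a uniform linear array with half-wavelength spacing, a known number of sources, and that the columns of the M × N CSI matrix C are used as snapshots; all names and parameters are hypothetical.

```python
import numpy as np

def steering_vector(theta_deg, n_antennas, spacing_wl=0.5):
    """Array response of a uniform linear array (spacing in wavelengths)."""
    k = np.arange(n_antennas)
    return np.exp(-2j * np.pi * spacing_wl * k * np.sin(np.deg2rad(theta_deg)))

def music_spectrum(C, n_sources, angles_deg, spacing_wl=0.5):
    """MUSIC pseudo-spectrum from a data matrix C (antennas x snapshots)."""
    C = np.asarray(C, dtype=complex)
    n_antennas = C.shape[0]
    R = C @ C.conj().T / C.shape[1]            # sample covariance matrix
    _, eigvecs = np.linalg.eigh(R)             # eigenvalues in ascending order
    En = eigvecs[:, : n_antennas - n_sources]  # noise subspace
    spectrum = np.empty(len(angles_deg))
    for i, theta in enumerate(angles_deg):
        a = steering_vector(theta, n_antennas, spacing_wl)
        spectrum[i] = 1.0 / np.linalg.norm(En.conj().T @ a) ** 2
    return spectrum
```

The AoA estimate is the angle at which the pseudo-spectrum peaks, e.g. `angles[np.argmax(music_spectrum(C, 1, angles))]` with `angles = np.arange(-90, 91)`. With only a few receive antennas, the number of resolvable sources is at most one fewer than the number of antennas.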

Results and Discussion
This section mainly presents the results on sensing and estimation of AoA based on the CSI data from measurements. It also gives the intermediate data-processing outcomes of several techniques applied on the established CSI data. The techniques comprise phase calibration, low-rank decomposition, outlier removal, and Kalman filtering. Finally, the results of respiratory sign prediction and estimation of AoA are given.

Calibration of CSI Phase
As shown in Figure 3, the phase distribution is presented for the case before and after calibration. The red points illustrate the phase distribution prior to calibration, while the blue points show the distribution following calibration. It is apparent that the red points are randomly scattered across the range between 0 and 360 degrees, whereas the blue points are concentrated within the 330 to 60 degree range. The aim was to eliminate any unknown time delays and phase disturbance. The results indicated that the original CSI responses were properly calibrated, and further processing could be carried out to derive the AoA of the impinging signal.

Low Rank Decomposition
To have clean CSI data, it is essential to eliminate environmental noise, which can significantly affect the accuracy and reliability of the data. As illustrated in Figure 4, the raw CSI data acquired from an antenna contained 900 data packets (each packet contained the response at one instant) and 30 subcarriers, with amplitudes in the range between 25 and 35 dB. However, due to the influence of environmental noise, the raw data appear jagged and erratic, which limits their usefulness for further analysis and interpretation. To address this issue, a denoising process is employed, which involves removing extraneous noise from the raw CSI data. As depicted in Figure 5, applying the RPCA algorithm based on the augmented Lagrange multiplier method led to significant improvements in the quality and clarity of the CSI data. The denoised data were smoother and more regular, indicating the successful removal of environmental noise. However, despite the efficacy of the RPCA algorithm, there may still be other types of noise in the data that require further processing. These types of noise may include systematic errors, interference from other sources, or random fluctuations that are not adequately addressed by the denoising algorithm.

Outlier Removal
The presence of outliers in the CSI can potentially have a significant impact on the accuracy and reliability of sensing and estimation when using the raw data. In light of this, we employed a Hampel filter as a robust method for detecting and removing outliers from the CSI data. Figure 6 illustrates a set of raw CSI data in amplitude, which comprises 900 packets and 30 subcarriers. It demonstrates the presence of multiple outliers. However, after applying the Hampel filter method, the resulting data (Figure 7) exhibited a significant improvement. It is noteworthy that the Hampel filter method proved highly effective in removing most of the outliers from the CSI data, which significantly enhanced the quality of the data and improved the reliability of the subsequent processing and analysis. The results verified that the Hampel filter method can be used as a powerful tool for outlier detection and removal in CSI analysis.

Applying the Kalman Filter
After the outliers have been removed, further processing is needed to cope with noise induced by measurement errors. This type of problem can be handled with a Kalman filter. In Figure 8, we compare the channel responses with and without a Kalman filter applied to the third subcarrier over 4789 data packets. The blue curve depicts the CSI values before error correction, while the red curve represents the results after applying the Kalman filter. As can be seen from Figure 8, the Kalman filter effectively corrected the measurement errors in the raw data, resulting in a smoother and more reliable curve. This step is crucial for subsequent data analysis to ensure the accuracy and reliability of the sensing and estimation.
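A scalar Kalman filter for one subcarrier's amplitude can be sketched as below. The sketch assumes a random-walk state model, and the process and measurement variances (`process_var`, `meas_var`) are hypothetical tuning values rather than the ones used in the study.

```python
import numpy as np

def kalman_filter_1d(z, process_var=1e-3, meas_var=0.25):
    """Filter a noisy amplitude series z with a random-walk Kalman filter."""
    z = np.asarray(z, dtype=float)
    x_hat = np.empty_like(z)
    x, p = z[0], 1.0                    # initial state estimate and variance
    for k, zk in enumerate(z):
        p = p + process_var             # predict: state unchanged, variance grows
        gain = p / (p + meas_var)       # Kalman gain
        x = x + gain * (zk - x)         # update with the measurement residual
        p = (1.0 - gain) * p            # posterior variance
        x_hat[k] = x
    return x_hat
```

A larger `meas_var` relative to `process_var` gives a smoother but more delayed output, so the ratio would need to be chosen so that the breathing-related variation is not smoothed away.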

Accurate Extraction of Respiratory Signs
To achieve the highest possible accuracy in acquiring the respiratory signal, a critical step is to identify the subcarrier with the lowest entropy, denoting the least degree of disorder, and to strip it of its DC component. To accomplish this, we examined the data for all 30 subcarriers concurrently and singled out the subcarrier with the lowest entropy value. As demonstrated in Figure 9, the chosen subcarrier exhibited an amplitude ranging from approximately −1.5 to 2 dB once the DC component of its channel response had been removed. Selecting the subcarrier with the lowest entropy value renders the CSI data more dependable and accurate, while removing the DC component enhances the performance further.
By performing frequency-domain analysis on the selected subcarrier, we can identify the significant frequency components that it contains. As shown in Figure 10, the subcarrier with the lowest entropy value (index number 30) contains several frequency components, including a 0.3 Hz component with an amplitude of 0.23 dB, which corresponds to the breathing signal. In addition, a 1.2 Hz component with an amplitude of 0.15 dB was observed, which corresponds to the heartbeat signal. Other auxiliary frequency components were likely caused by noise.
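The selection step can be sketched as follows. The text does not state how the entropy is computed, so this sketch assumes the Shannon entropy of a histogram of each subcarrier's mean-removed amplitude, with bin edges shared across subcarriers; the function names and the number of bins are likewise hypothetical.

```python
import numpy as np

def shannon_entropy(x, edges):
    """Shannon entropy (bits) of the histogram of x over the given bin edges."""
    counts, _ = np.histogram(x, bins=edges)
    p = counts[counts > 0] / counts.sum()
    return float(-(p * np.log2(p)).sum())

def select_subcarrier(csi_amp, bins=32):
    """Pick the lowest-entropy subcarrier from a (packets, subcarriers) array."""
    csi_amp = np.asarray(csi_amp, dtype=float)
    centred = csi_amp - csi_amp.mean(axis=0)      # remove each DC component
    # Shared bin edges so that entropies are comparable across subcarriers
    edges = np.linspace(centred.min(), centred.max(), bins + 1)
    entropies = np.array(
        [shannon_entropy(centred[:, k], edges) for k in range(centred.shape[1])]
    )
    best = int(np.argmin(entropies))
    return best, centred[:, best], entropies
```

The returned index is zero-based, so the subcarrier reported as index number 30 in the text would correspond to `best == 29` under this convention.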
To accurately extract the respiratory signal, it is crucial to eliminate the irrelevant frequencies present in the original signal shown in Figure 10. To achieve this, we designed a fourth-order Butterworth bandpass filter with a pass band between 0.1 and 0.5 Hz, which filters out components outside this band and preserves mainly the respiratory signal. The time-domain signal after applying the filter is given in Figure 11.
We then conducted an FFT operation on the filtered signal of Figure 11 to examine it more carefully in the frequency domain; the spectrum is illustrated in Figure 12. The results clearly indicated that the signal contains a dominant frequency component at 0.3 Hz. We assume that this is caused by a respiration process in the environment.
To test this assumption, we employed a method based on the instantaneous phase to determine the respiratory rate over a longer period and display it in the time domain, as shown in Figure 13. Four breathing cycles are clearly visible within a 15 s timeframe, with each breath lasting 2.1 s on average and an interval of 1.6 s between consecutive breaths, i.e., a period of about 3.7 s. This finding is consistent with the result of the frequency-domain analysis that the sensed breathing rate was approximately 0.3 Hz on average.
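The filtering and spectral steps can be sketched with SciPy as below. The packet rate `fs` is not stated in this section and must be supplied by the user. In SciPy's convention, `butter(N, ...)` with `btype="bandpass"` yields a filter of order 2N, so `order=2` is assumed here to match the fourth-order bandpass described in the text; zero-phase filtering with `sosfiltfilt` is also an assumption of this sketch.

```python
import numpy as np
from scipy.signal import butter, sosfiltfilt

def bandpass_respiration(x, fs, low=0.1, high=0.5, order=2):
    """Zero-phase Butterworth bandpass keeping the 0.1-0.5 Hz respiration band."""
    sos = butter(order, [low, high], btype="bandpass", fs=fs, output="sos")
    return sosfiltfilt(sos, x)

def dominant_frequency(x, fs):
    """Frequency (Hz) of the largest FFT magnitude of the mean-removed signal."""
    x = np.asarray(x, dtype=float)
    spectrum = np.abs(np.fft.rfft(x - x.mean()))
    freqs = np.fft.rfftfreq(x.size, d=1.0 / fs)
    return float(freqs[np.argmax(spectrum)])
```

The frequency resolution of the FFT equals the reciprocal of the record length, so a 60 s record resolves the spectrum in steps of about 0.017 Hz.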
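One way to obtain a respiratory rate from the instantaneous phase is through the analytic signal, sketched below. This is an assumed realisation of the instantaneous-phase method mentioned above, not necessarily the authors' procedure; the function name and the 10% edge trimming are illustrative choices, and the input is expected to be the bandpass-filtered respiration signal sampled at a known rate `fs`.

```python
import numpy as np
from scipy.signal import hilbert

def breathing_rate_from_phase(x, fs):
    """Estimate the breathing rate (Hz) from the instantaneous phase of x."""
    x = np.asarray(x, dtype=float)
    analytic = hilbert(x - x.mean())              # analytic signal
    phase = np.unwrap(np.angle(analytic))         # instantaneous phase (rad)
    inst_freq = np.diff(phase) * fs / (2.0 * np.pi)
    # Discard 10% at each end, where the Hilbert transform has edge effects
    trim = inst_freq.size // 10
    core = inst_freq[trim:inst_freq.size - trim]
    return float(np.median(core)), inst_freq
```

Multiplying the returned rate by 60 gives breaths per minute, and the full `inst_freq` series shows how the rate varies over the record.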

AoA Estimation
To accurately estimate the AoA of an incoming signal, reliable data at the receiver end is crucial. In this study, we utilized data from three receiving antennas in the experiment setting, each containing 30 CSI values at any measuring time instant, and collected a total of 900 data packets (900 time instants) for analysis.
Initially, the CSI data of each packet were reshaped into a 3-by-30 matrix (each CSI data packet contained the channel responses received by three receiving antennas, and each antenna received responses for 30 channels) for AoA estimation, and a spatial filter was then applied at each candidate angle. The resulting spatial spectra were evaluated to determine the AoAs, with the peak values indicating the estimated arrival angles.
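The spatial filtering over candidate angles can be sketched with a conventional (Bartlett) beamformer, shown below as one possible realisation; the MUSIC algorithm referred to in the Discussion would be an alternative. The sketch assumes a three-element uniform linear array with half-wavelength spacing, treats the 30 CSI values per antenna as snapshots, and uses a particular sign convention for the steering vector; none of these details are confirmed by the text.

```python
import numpy as np

def steering_vector(theta_deg, n_ant=3, spacing_wl=0.5):
    """Array response of a uniform linear array for a plane wave from theta."""
    k = np.arange(n_ant)
    return np.exp(-2j * np.pi * spacing_wl * k * np.sin(np.deg2rad(theta_deg)))

def bartlett_spectrum(csi_packet, angles_deg, spacing_wl=0.5):
    """Spatial spectrum of one packet; csi_packet is complex, shape (3, 30)."""
    X = np.asarray(csi_packet)
    n_ant, n_snap = X.shape
    R = X @ X.conj().T / n_snap                   # sample spatial covariance
    A = np.stack(
        [steering_vector(a, n_ant, spacing_wl) for a in angles_deg], axis=1
    )                                             # shape (n_ant, n_angles)
    # Beamformer output power a^H R a for every candidate angle
    power = np.real(np.einsum("ik,ij,jk->k", A.conj(), R, A))
    return power / n_ant**2
```

The estimated AoA of a packet is the candidate angle at which the returned spectrum peaks, for example `angles_deg[np.argmax(power)]` on a grid from −90 to 90 degrees.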
Figure 14 displays the angular spectrum of the received signals at a specific time instant before phase calibration was applied. It indicates that there were multiple signal sources within the environment, with the strongest signal impinging from −49 degrees. The same process was applied to all 900 data packets, and the angle corresponding to the peak value was found in each case. The distribution of the arrival angles is depicted in Figure 15. The arrival angles were widely dispersed, which made it impossible to determine the precise AoA of the received signals. Hence, a phase calibration process was essential.
A phase calibration step was therefore applied to the CSI data before making the final estimation of the AoAs of the received signals. After calibrating the 900 data packets, the distribution of the AoAs stabilised at −82 degrees, as illustrated in Figure 16. However, among the 900 calculations there were 10 cases in which the predicted AoA was 85 degrees. The actual angle of the transmitter relative to the receiver was −83.2 degrees. The small difference of about 1.2 degrees between the estimated and actual angles reflected the effectiveness of the proposed method for AoA estimation; however, an accurate estimation requires observation over a long period of time.
In conclusion, the phase-calibrated CSI data, in combination with spatial filtering and signal spectrum estimation, can be effectively utilized to estimate the AoAs of the received signals and ultimately achieve source positioning. The present study underscores the significance of utilizing reliable and calibrated CSI data for accurate and reliable signal analysis and positioning.
Figure 16. AoA distribution determined from the 900 data packets after applying phase calibration over the CSI data.
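Aggregating the per-packet peak angles into a distribution, and reading off its most frequent value, can be sketched as below. The function name and the 1-degree bin width are hypothetical choices; the input is assumed to be one peak angle per packet, in degrees.

```python
import numpy as np

def modal_aoa(peak_angles_deg, bin_width_deg=1.0):
    """Histogram per-packet AoA peaks and return the most frequent angle."""
    half = bin_width_deg / 2.0
    # Bin edges chosen so that bin centres fall on -90, ..., 90 degrees
    edges = np.arange(-90.0 - half, 90.0 + bin_width_deg, bin_width_deg)
    counts, edges = np.histogram(peak_angles_deg, bins=edges)
    centres = 0.5 * (edges[:-1] + edges[1:])
    return float(centres[np.argmax(counts)]), counts, centres
```

Taking the mode over many packets makes the estimate insensitive to a small number of packets whose peaks fall far from the main cluster, which is one reason a longer observation period helps.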

Discussion
CSI data inevitably contain noise. In particular, high-resolution sensing applications need phase change information, but the raw phase cannot be used directly: it is affected by many factors and has to be sanitized before predictions are made. In this work, a comprehensive approach is proposed to handle CSI data acquired without stringent accuracy requirements, so that it can be applied to more general cases while retaining high-performance sensing capability.
Similar to the approach suggested in [27], CSI data were utilized as the basis for retrieving sensing information, and all subcarrier information was processed collectively. Before the determining step, the subcarrier data with the highest quality were chosen and exploited to extract the physiological information, thus ensuring a high level of accuracy for the sensing results. In the real-time sleep detection study [28], the peak-to-peak detection method was used to estimate the respiration rate, whereas in this study an FFT-based estimation method was used to extract the respiratory rates. Peak-to-peak detection may not be able to capture small variations in respiration, and thus may not be as accurate as the methods exploiting CSI data. The results of our work were compared with other recently reported methods for detecting human physical information, as presented in Table 2. A clear advantage was demonstrated by this work, as it used sophisticated data-processing methods to provide reliable results with high resolution, including for the AoA information.
Other methods handle noise differently. For example, in [31], indoor positioning is realized from RSS information, with a method based on MDTW (multidimensional time warping) and WLS (weighted least squares) used to classify WiFi signals and remove noise. MDTW and WLS place higher quality requirements on the input data, and noise or outliers may affect the accuracy of the results. The noise removal method used in this work was based on the RPCA algorithm and the least squares method, which can effectively separate the signal and noise components. The proposed algorithm is flexible: its parameters can be adjusted according to the needs of the application to obtain the best denoising effect. The method in [32] was evaluated in both single-room and multi-room settings, with average prediction errors of 2-3 m and 3-4 m, respectively. In contrast, when the MUSIC algorithm was used to estimate AoAs, the errors ranged between 1 and 4 degrees. In [33,34], indoor positioning methods were developed that identify locations from the RSS information of WiFi signals using a known fingerprint dataset. The prediction accuracy was improved but remained limited to a level not comparable to the results based on CSI data, where phase information was included and processed.
Sensing methods based on CSI data demonstrated their benefits and were favored over approaches exploiting RSS information, which are conventionally adopted for certain applications such as indoor positioning. In this work, publicly available CSI data were used, and comprehensive data-processing methods were developed for high-resolution health-related sensing, including estimation of the AoAs of the signals at the receiver end. The approach places a low requirement on the quality of the data used for prediction. However, collecting the CSI data over a sufficient period of time is important to ensure that the prediction results are of high resolution, in particular for AoA estimation.

Figure 1. The system architecture for sensing, which consists of three main modules: acquisition of CSI data, data processing, and sensing indoor environment and AoA estimation.

Figure 2. Block diagram of the RPCA algorithm.

Figure 3. Raw phase responses (red) of 27,000 CSI measurements (900 packets with 30 CSI measurements in each packet), and phase response after linear calibration (blue).

Figure 5. The same data as in Figure 4 after the RPCA algorithm was applied.

Figure 8. Kalman filter applied to the amplitude of the CSI data on subcarrier 3.

Figure 9. The selected subcarrier with the lowest entropy. The subcarrier with index number 30 was chosen, and the DC component in the frequency response of the subcarrier was removed.

Figure 12. Spectrogram of the signal after applying the bandpass filter.

Figure 14. AoA estimation for one of the 900 data packets without phase calibration, showing the resulting signal angular spectrum.

Figure 15. AoA distribution determined from the 900 data packets before applying the phase calibration on the CSI data.

Table 1. CSI data files naming convention.

Table 2. Comparison of sensing capability with CSI data by recently developed methods.