MHHT-Based Method for Analysis of Micro-Doppler Signatures for Human Finer-Grained Activity Using Through-Wall SFCW Radar

Ultra-wideband radar-based penetrating detection and recognition of human activities has in recent years become a focus of remote sensing in various military applications, such as urban warfare, hostage rescue, and earthquake post-disaster rescue. However, a micro-Doppler signature (MDS) extraction method for human motion that offers high time-frequency resolution, strong anti-interference ability, and broad adaptability, and that can thereby provide more detailed features for human activity recognition and classification, especially in non-free-space detection environments, is still urgently needed. To address this issue, a multiple Hilbert-Huang transform (MHHT) method is proposed for high-resolution time-frequency analysis of the finer-grained human activity MDS hidden in ultra-wideband (UWB) radar echoes acquired in a through-wall detection environment. An improved HHT, in which effective intrinsic mode functions (IMFs) are selected according to the cosine similarity (CS) principle, is applied to each channel signal within the effective channel scope of the UWB radar signal, and the results are then integrated along the range direction. The activities of swinging one or two arms while standing at a spot 3 m from a wall were used to validate the ability of the proposed method to extract and separate the MDS of different moving body structures with high time-frequency resolution. In addition, the correspondence between the frequency components in the MHHT-based spectra and the structures of the moving human body was demonstrated according to the radar Doppler principle combined with the principles of human body kinematics. Moreover, six common finer-grained human activities and a piaffe at different ranges in the through-wall detection environment were used to confirm the adaptability of the novel method to different activities and its strong anti-interference ability in a low signal-noise-clutter ratio (SNCR) environment, which is critical for remote sensing in the military applications mentioned above.


Introduction
Remote sensing for the recognition and classification of various human activities using radar has attracted great attention from researchers [1][2][3][4][5][6][7][8] since Victor Chen introduced micro-motion in radar observation [9][10][11], especially for finer-grained human activities (e.g., waving, jumping, picking up an object, standing with random micro-shaking, etc.). It has critical and promising applicability in many fields, such as anti-terrorism, post-disaster search and rescue, border control, and patient monitoring in hospitals.
In addition to the constant Doppler frequency shift induced by the bulk translation of a moving target illuminated by the electromagnetic waves of a radar system, basic micro-motions, such as the oscillatory or rotational motion of the target or of any of its structural components, generate additional sideband frequencies around the main Doppler shift. These are known as micro-Doppler signatures (MDS) [11]; they contain additional detailed characteristics related to these structures and serve as favorable signatures for identifying different human activities, each of which is depicted by a distinct MDS. Consequently, effective methods for analyzing and extracting MDS have gradually become the focus of many studies [3,12,13]. Although some work on human activity classification has also been done in the millimeter-wave regime [1,14], the MDS information derived from millimeter-wave echoes is poor and the effective through-wall working range is limited, owing to weak anti-interference ability and the lack of range resolution of the single-frequency continuous-wave structure. The emerging ultra-wideband (UWB) radar technology shows excellent performance in capturing the motion features of the multiple scatterers that result from the movements of different human body segments, because of its high range resolution [15,16]. The detailed MDS of the different scatterers are distributed over a range scope that spans many range bins. From another perspective, different range bins can also be viewed as different channel signals. Therefore, the key to extracting and analyzing the MDS of human activity hidden in the UWB radar echo is to make full use of each valuable channel signal [17] and to ensure reasonable channel integration.
Time-frequency (T-F) analysis has been adopted as the main tool for analyzing the radar signals of moving targets because it presents a signal in a 2D feature space, which facilitates the visualization and interpretation of complex electromagnetic signal characteristics. Several studies have analyzed and recognized the MDS of moving targets based on conventional T-F methods, and encouraging progress has been achieved. For example, the MDS features derived from short-time Fourier transform (STFT) spectra were exploited to analyze multifarious human motions and classify them with pattern-recognition algorithms. These studies included the classification of seven activities (running, walking without a stick, walking while holding a stick, crawling, boxing while moving forward, boxing while standing at one position, and sitting) using a support vector machine (SVM) [5] or an artificial neural network (ANN) [18], classification of six kinds of finer-grained human activities (piaffe, picking up an object, waving, jumping, standing with random micro-shaking, breathing while sitting) in a through-wall environment using an optimal self-adaption SVM [7], classification of five kinds of targets (humans, dogs, cars, bicycles, and vehicles) using an SVM [19], unarmed/armed personnel classification based on the Bayes classifier [20], and investigation of the MDS characteristics of 14 different human movements in stationary and forward-moving states in free-space or through-wall environments [1]. However, these classifications rely excessively on the intelligent pattern-recognition algorithm rather than on the crucial underlying features. The classification performance degrades significantly when the T-F analysis is even slightly affected by external factors. Other conventional T-F analysis methods have also been applied to depict the MDS, such as the bilinear Wigner-Ville distribution (WVD), log-Gabor filters [21], and the S-method [22]. However, only relatively regular activities, such as swinging of arms and walking with or without arm movements, were analyzed in these studies.
In reality, for various random, finer-grained human activities, the body usually does not move at a constant amplitude and velocity, which produces irregular, non-uniform, transient, and non-stationary signals [23]. Hence, the conventional STFT, which assumes linearity and stationarity within each window segment of the signal, is not applicable, and the trade-off between time resolution and frequency resolution cannot be avoided [24]. Although the wavelet transform (WT) has better time and frequency resolution, it is sensitive to the choice of mother wavelet, which is inconvenient to change [25]. The bilinear WVD has better time and frequency resolution in the spectrum than the linear distributions, but it suffers from cross-term interference, which is difficult to remove [24,26]. In addition to these imperfections, the common T-F transforms, including the STFT, WT, WVD, and Choi-Williams distribution, require an a priori base function of the signal to obtain the best analysis performance [23,27], such as the cosine function in the STFT and the mother wavelet in the WT. For the various random human motion signals, the optimal a priori base function is difficult to determine and cannot change adaptively with the signal characteristics; thus, the best analysis performance usually cannot be achieved. Owing to these shortcomings, although these T-F transform methods have contributed to the analysis and feature extraction of coarse, general MDS of human motion, their T-F resolution, detail resolution, and anti-interference capability are poor; hence, the more detailed MDS of different limbs and the torso cannot be separated and displayed clearly in the 2D T-F spectrum [5,28]. In addition, as our previous manuscript [7] and other studies [5,13] demonstrated, the MDS obtained with the common methods attenuate sharply or become weak and blurred under through-wall or remote detection conditions. The MDS of the various motion components are then intertwined or concealed in unavoidably strong noise and clutter; consequently, the instantaneous MDS features related to different target components are difficult to extract and distinguish with the conventional methods.
The Hilbert-Huang transform (HHT), first developed by Huang [27], provides a new approach that is applicable to various non-stationary signals. It does not rely on an a priori base function and offers a higher frequency resolution and more accurate timing of transient information [29]. The HHT involves two steps: empirical mode decomposition (EMD) and the Hilbert spectrum (HS). The EMD adaptively decomposes a signal, according to the characteristics of the signal itself, into a series of T-F components called intrinsic mode functions (IMFs). Each IMF reflects the inherent oscillation characteristics of the original signal at a different frequency scale. The HS then displays the instantaneous frequency and amplitude of the different IMFs on the two-dimensional (2D) T-F plane. Researchers have attempted to apply the EMD or HHT to the analysis or classification of human motion based on continuous-wave (CW) radar [13,30,31]. Additionally, the conventional ensemble empirical mode decomposition (EEMD) has been applied to analyze and extract the MDS of moving human targets in a through-wall detection environment [32,33]. However, the results are unsatisfactory because the information in the CW radar signal is poor and there is no effective approach to denoising, which causes serious mode mixing. Moreover, there is no effective self-adaptive method to eliminate the redundant IMFs in the HHT, which is crucial for achieving a high signal-noise-clutter ratio (SNCR). In addition, these papers only decomposed the signal in the time domain based on EMD or EEMD to provide weakly characteristic features as the basis for human activity classification; they did not clearly separate, or theoretically explain, the valuable MDS components resulting from different moving body parts, which is critical for accurate and efficient classification and recognition of different human activities. Although a rough framework of a novel time-frequency analysis method based on multiple HHT was proposed in our last paper [34], its adaptability and anti-interference ability are unsatisfactory, and the correspondence between the different MDS components and the various moving body parts remains unclear.
Therefore, this paper proposes an EEMD-based multiple HHT (MHHT) combined with a channel integration method to enable high-resolution T-F analysis of finer-grained human activity MDS hidden in UWB radar echoes in a through-wall detection environment. The method first delimits the valuable channels of the UWB radar signal along the range axis and performs the EEMD on each channel signal. During the EEMD, for each valuable channel signal, the cosine similarity (CS) in vector space between the channel data and the corresponding IMFs is used to select the valuable IMFs, and the HS is computed for each selected IMF for spectrum analysis. Finally, the MHHT method accumulates the HS from all valuable channels of the UWB radar signal, which contain different motion information distributed over different range positions, for better visualization and convenient analysis. The experiments show that the proposed method not only has an excellent ability to extract the different detailed MDS derived from distinct movements of human body parts but also displays outstanding anti-interference ability in a low-SNCR detection environment. These advantages were verified and evaluated qualitatively and visually based on the experimental results.
This paper is organized as follows: Section 2 describes the Materials and Methods, mainly including the UWB radar system used to collect human motion data, the experimental setup, and the principle of the MHHT-based T-F analysis method. Based on the experimental results, various performance tests of the proposed method considering multiple aspects are presented in Section 3. Finally, a discussion and conclusion are presented in Sections 4 and 5, respectively.

UWB Radar System and Experimental Setup
The SFCW radar system developed by our group was the UWB radar system used for through-wall detection of finer-grained human activities in this study. It realizes the UWB by transmitting a series of discrete tones in a stepwise manner to cover the wide radar bandwidth in the time domain. Studies show that SFCW radar has some technical advantages over phase-modulated continuous-wave (PMCW) radar and frequency-modulated continuous-wave (FMCW) radar [35], including: (1) it can effectively suppress the noise of the received signal by controlling the bandwidth (IF bandwidth) of the receiver; and (2) the transmitted and received signals are both frequency-domain signals, which makes it convenient to apply frequency-domain signal processing techniques to target echo processing and analysis. The schematic diagram of the SFCW system is shown in Figure 1. The data shown in this paper were generated by the radar system with the following parameters: a 3.0-GHz bandwidth within the 0.5-3.5 GHz operating frequency band with a 30-MHz stepping frequency provides not only good penetration ability but also excellent range resolution. A 4-ms pulse duration is used for sweeping the entire band. A 250-Hz pulse repetition frequency and a 5-m maximum unambiguous range allow the micro-motion information of the whole body to be acquired within the unambiguous Doppler region. In addition, the transmitting and receiving antennas adopt cross-polarization, the maximum transmitting power is 10 dBm with a dynamic range >72 dB, and the sensitivity of the receiver is −90 dBm. The analog-to-digital conversion (ADC) accuracy is >12 bit, and the sampling rate is 250 Hz. The detailed working principle and formula derivation are presented in our previous paper [7]. In the subsequent processing, we used only the signal from the one of the two receiving antennas that showed the better performance.
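The quoted parameters can be cross-checked with two standard stepped-frequency relations, the range resolution c/(2B) and the maximum unambiguous range c/(2Δf). The following is only an illustrative check, not part of the original processing chain; the variable names and the rounded speed of light are assumptions, and the Doppler limit assumes real-valued slow-time samples.

```python
# Cross-check of the SFCW parameters quoted in the text.
C = 3.0e8  # speed of light in m/s (rounded, assumption)

bandwidth_hz = 3.0e9   # 0.5-3.5 GHz operating band
step_hz = 30.0e6       # stepping frequency
sweep_s = 4.0e-3       # duration of one sweep over the entire band

range_resolution_m = C / (2.0 * bandwidth_hz)   # c / (2B)
unambiguous_range_m = C / (2.0 * step_hz)       # c / (2 * delta_f)
n_steps = round(bandwidth_hz / step_hz)         # frequency intervals per sweep
prf_hz = 1.0 / sweep_s                          # one sweep per slow-time sample
max_doppler_hz = prf_hz / 2.0                   # Nyquist limit (real samples assumed)

print(range_resolution_m, unambiguous_range_m, n_steps, prf_hz, max_doppler_hz)
```

With these inputs, the 30-MHz step reproduces the stated 5-m unambiguous range, and the 4-ms sweep reproduces the 250-Hz repetition frequency.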
Remote Sens. 2017, 9, 260

The data presented in this paper were collected in the SFCW radar experiment shown in Figure 2a. During the experiment, a subject was positioned on one side of a laboratory brick wall (~30 cm thick) and the SFCW radar system was set up on the other side. Then, as illustrated in Figure 2a, the subject performed eight types of specific finer-grained human activities roughly on the spot: swinging one arm while standing, swinging two arms while standing, performing a piaffe, picking up an object, waving, standing up, standing with micro-shaking, and breathing while sitting.

The EEMD-Based MHHT T-F Analysis Algorithm
In this section, we present a novel T-F method for analyzing the MDS in UWB radar signals of finer-grained human activities. It improves on the basic framework proposed in our previous paper [34] and consists of three main steps: signal preprocessing, effective channel scope selection, and MHHT, that is, the EEMD-based multiple HHT combined with channel integration for high-resolution T-F analysis of finer-grained human activity MDS hidden in UWB radar echoes in a through-wall detection environment. In addition, adaptive selection of effective IMFs based on the CS criterion is incorporated into the EEMD.

Signal Preprocessing
Owing to the excellent range resolution of UWB radar, echoes at different time delays, which correspond to different ranges from the radar, can be acquired simultaneously. After amplification and sampling in the hardware circuit, the echoes are stored in a 2D data matrix $R$:

$$R = \{ r_m(n) : m = 1, \dots, M,\ n = 1, \dots, N \}$$

As in the SFCW radar signal of finer-grained human motion shown in Figure 3a, the rows, addressed as fast time, are associated with the range, and the columns are associated with time. Each row contains one received waveform. $M$ is the number of samples in range and determines the detection range of the radar. $N$ is the number of samples in slow time and, together with the sampling frequency, determines the total duration of the data. As shown in Figure 3a, the echoes detected by the UWB radar system are two-dimensional and can be viewed as a set of time series in multiple channels, that is, a multichannel time series with $M$ channels of length $N$.
Before performing MHHT, the UWB echo data are preprocessed to eliminate stationary clutter and noise, such as the background clutter and white Gaussian noise visible in Figure 3a, which shows the SFCW radar signal of a piaffe performed 3 m behind the wall. First, averaging and downsampling are performed in both the range and time dimensions, which improves the SNCR and reduces the data size. Second, a finite impulse response (FIR) motion filter performs windowed mean subtraction in the time dimension to remove the stationary background clutter resulting from scattering by rubble and the human body. After preprocessing, the UWB data are denoted as $\tilde{R} = \{ \tilde{r}_m(n) : m = 1, \dots, \tilde{M},\ n = 1, \dots, \tilde{N} \}$, so the data size becomes $\tilde{M} \times \tilde{N}$ ($\tilde{M} \ll M$, $\tilde{N} < N$). For the above SFCW radar signal, the background noise and the strong direct wave were eliminated effectively after preprocessing, as shown in Figure 3b.
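The two preprocessing operations can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: the block-averaging factors, the window length, the edge padding, and the function names `block_average` and `remove_stationary_clutter` are all hypothetical, and the motion filter is approximated by subtracting a moving average along slow time.

```python
import numpy as np

def block_average(data, range_factor, time_factor):
    """Average non-overlapping blocks, which downsamples both axes."""
    m, n = data.shape
    m_out, n_out = m // range_factor, n // time_factor
    trimmed = data[:m_out * range_factor, :n_out * time_factor]
    blocks = trimmed.reshape(m_out, range_factor, n_out, time_factor)
    return blocks.mean(axis=(1, 3))

def remove_stationary_clutter(data, window):
    """Subtract a sliding-window mean along slow time (axis 1) from each channel."""
    kernel = np.ones(window) / window
    out = np.empty(data.shape, dtype=float)
    left = window // 2
    right = window - 1 - left
    for i, row in enumerate(data):
        padded = np.pad(row, (left, right), mode="edge")
        local_mean = np.convolve(padded, kernel, mode="valid")  # same length as row
        out[i] = row - local_mean
    return out

# Synthetic example: a static return of amplitude 5 plus weak noise.
rng = np.random.default_rng(0)
raw = 5.0 + rng.normal(0.0, 0.1, size=(8, 40))
reduced = block_average(raw, range_factor=2, time_factor=4)
clean = remove_stationary_clutter(reduced, window=5)
print(reduced.shape, float(np.abs(clean).max()))
```

In this toy case the static component is removed and only a noise-level residual remains; a moving target would survive the subtraction because its return changes within the window.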

Effective Channel Scope Selection
During this stage, the scope of channels that carry motion features is determined from the preprocessed UWB data based on the energy principle. First, the preprocessed signal recorded with no subject present (the empty-scene measurement) is averaged along the channel direction to obtain the channel average energy $E_0$, which represents the average level of background noise and clutter in the UWB signals after preprocessing. The channel with the greatest energy, denoted $d_0$, is taken as the center of the motion scope along the range direction, because most of the direct wave and noise have been removed by preprocessing and the strongest motion produces the greatest energy. According to the characteristic analysis of finer-grained human motion [7,36], the energy of the human motion radar signal decays gradually from the center toward both sides. Traversing from the center toward both sides, the two channels whose energy values are close to $E_0$ are taken as the boundaries of the valuable channels, recorded as $d_c$ and $d_f$, so the number of valuable channels is $M' = d_f - d_c + 1$. The UWB data are then denoted as

$$R' = \{ r'_m(n) : m = 1, \dots, M',\ n = 1, \dots, N' \}$$

where $r'_m$ is the $m$-th channel of the $M'$-channel radar profiles. Thus, the data size becomes $M' \times N'$, where $M'$ is much smaller than the number of preprocessed channels and $N'$ equals the number of preprocessed slow-time samples.
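The traversal from the peak channel toward both sides can be sketched as follows. This is an assumed reading of the energy principle, not the authors' code: channel energy is taken as the sum of squares over slow time, "close to $E_0$" is interpreted as falling to or below `tolerance * e0`, and the name `select_channel_scope` and the `tolerance` parameter are hypothetical.

```python
import numpy as np

def select_channel_scope(data, e0, tolerance=1.0):
    """Return (d_c, d_f), the inclusive channel bounds around the energy peak."""
    energy = np.sum(np.asarray(data, dtype=float) ** 2, axis=1)
    d0 = int(np.argmax(energy))          # center of the motion scope
    threshold = tolerance * e0
    d_c = d0
    while d_c > 0 and energy[d_c - 1] > threshold:
        d_c -= 1                         # walk toward the radar
    d_f = d0
    while d_f < len(energy) - 1 and energy[d_f + 1] > threshold:
        d_f += 1                         # walk away from the radar
    return d_c, d_f

# Synthetic example: four strong channels surrounded by background-level channels.
t = np.arange(50)
amps = np.array([0.1, 0.1, 0.1, 1.0, 2.0, 3.0, 2.0, 0.1, 0.1, 0.1])
data = amps[:, None] * np.sin(0.3 * t)[None, :]
empty = 0.1 * np.sin(0.3 * t)[None, :].repeat(10, axis=0)   # stand-in for the empty scene
e0 = float(np.mean(np.sum(empty ** 2, axis=1)))
d_c, d_f = select_channel_scope(data, e0, tolerance=2.0)
effective = data[d_c:d_f + 1]            # M' = d_f - d_c + 1 channels
print(d_c, d_f, effective.shape)
```

The margin above $E_0$ is a design choice: a larger `tolerance` narrows the scope and rejects more clutter, at the risk of discarding weak limb returns near the boundaries.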

MHHT
As in the MHHT process illustrated in Figure 4, the HHT is conducted on each effective channel signal to obtain its Hilbert T-F spectrum. As a result, the joint-time-frequency-channel representation (JTFCR) is acquired. Finally, by accumulating the JTFCR along the channel direction, a joint time-frequency representation is generated, which is intuitive and convenient for characteristic analysis and feature extraction. Most importantly, during the HHT analysis of each effective channel signal, the EEMD is used to decompose the signal adaptively according to its own characteristics and to avoid the mode-mixing problem. Moreover, the CS in vector space between each channel signal and its corresponding imfs is applied to pick out the valuable imfs, which form the time series for the subsequent Hilbert spectrum analysis. The following steps 1-3 are executed on all of the $M'$ effective channel signals in the same manner, so they are introduced by taking one channel signal, referred to as $r(t)$, as an example.
1st step: EEMD. Random white noise of appropriate intensity is added to $r(t)$.
(a) Conduct the conventional EMD operation [37] on the noise-added signal. Based on the local characteristics of the time series, EMD decomposes the complex signal into a finite number of intrinsic mode functions.
(b) $Q$ imfs are acquired after the EMD operation, each representing a single-component signal at a certain frequency scale:

$$\mathrm{IMF} = (\mathrm{imf}_1, \mathrm{imf}_2, \dots, \mathrm{imf}_q, \dots, \mathrm{imf}_Q)$$

Steps (a) and (b) are repeated on $r(t)$, with a different white noise series added in each decomposition. After a total of $L$ decompositions, we obtain the corresponding IMFs, $\mathrm{LIMF} = (\mathrm{IMF}_1, \mathrm{IMF}_2, \dots, \mathrm{IMF}_l, \dots, \mathrm{IMF}_L)$, where each $\mathrm{IMF}_l$ contains the $Q$ imfs, with different frequency characteristics, from the $l$-th EMD operation.
The corresponding imfs from the $L$ EMD operations are averaged, and the average is taken as the final imf of $r(t)$:

$$\mathrm{imf}_q = \frac{1}{L} \sum_{l=1}^{L} \mathrm{imf}_{l,q}, \quad q = 1, \dots, Q$$

where $\mathrm{imf}_{l,q}$ denotes the $q$-th imf in $\mathrm{IMF}_l$. If the amplitude of the added white noise is too small relative to the original data, the extrema of the original signal may not change, and the mode-mixing problem of the EMD remains unsolved. If the amplitude of the added white noise is too large relative to the original data, it introduces a significant number of false components into the decomposition. Moreover, too many decompositions increase the computational time and burden. In this study, by comparing the intensity characteristics of the background signals and the measured signals of human activity, the intensity of the white noise was set to 0.3 dBW. In addition, by observing the decomposition quality and the corresponding computational time for different numbers of decompositions, we found that $L = 30$ met the requirement. Even after the EEMD processing, some redundant components remain among the imfs; they decrease the signal-to-noise ratio of the signal and blur the motion features in the later Hilbert spectrum. Therefore, adaptive selection of the valuable imfs is crucial.
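The ensemble part of this step, adding a fresh noise series, decomposing, and averaging the corresponding imfs over $L$ trials, can be sketched as follows. The EMD routine itself is not implemented here: `emd_fn` is a placeholder for any real EMD implementation, and the sketch assumes it always returns the same number of imfs, which a real EMD does not guarantee. The `moving_average_split` decomposer is only a stand-in used to make the example runnable, and `noise_std` is a plain standard deviation rather than the 0.3 dBW intensity quoted in the text.

```python
import numpy as np

def eemd(signal, emd_fn, n_imfs, trials=30, noise_std=0.1, seed=0):
    """Average the imfs of `trials` noise-perturbed decompositions.

    emd_fn(x, n_imfs) must return an array of shape (n_imfs, len(x)).
    """
    rng = np.random.default_rng(seed)
    signal = np.asarray(signal, dtype=float)
    acc = np.zeros((n_imfs, signal.size))
    for _ in range(trials):
        noisy = signal + rng.normal(0.0, noise_std, signal.size)  # new noise each trial
        acc += emd_fn(noisy, n_imfs)
    return acc / trials

def moving_average_split(x, n_imfs):
    """Stand-in decomposer, NOT real EMD: a detail part and a smooth part."""
    smooth = np.convolve(x, np.ones(5) / 5.0, mode="same")
    return np.vstack([x - smooth, smooth])

t = np.arange(200) / 250.0
signal = np.sin(2 * np.pi * 2.0 * t)
imfs = eemd(signal, moving_average_split, n_imfs=2, trials=30, noise_std=0.1)
print(imfs.shape)
```

Because each trial uses an independent noise series, the added noise shrinks roughly as the inverse square root of the number of trials in the averaged result, which is the reason for averaging over the ensemble.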
The most direct and obvious signs of human motion are the changes in movement direction and the phase changes over time. The CS in vector space is an effective measure of the similarity of two vectors: it is insensitive to their absolute magnitudes and emphasizes the difference in direction between them. Hence, for the imfs resulting from the EEMD, by computing the CS between each imf and r and keeping only the imfs whose CS values exceed the threshold, the Q effective imfs representing the human motion features can be selected as follows:

$$CS = \{S_{\cos\theta_1}, S_{\cos\theta_2}, \ldots, S_{\cos\theta_q}, \ldots, S_{\cos\theta_Q}\}$$

In this study, the CS threshold is set to 0.3, based on the relationship between the signal shape and the CS value observed in numerous experiments. Moreover, according to the frequency analysis of numerous imfs with a CS of less than 0.3, we found that the spectra of those imfs contain only chaotic frequencies and cannot display regular motion characteristics. After the selection, the final IMF is constructed from the Q effective imfs (Equation (8)).

The advantages of EEMD and of the effective selection of imfs are as follows. On the one hand, for each EMD decomposition, the added white noise is uniformly distributed over the entire time-frequency space. The different frequency scales of the signal are projected automatically onto the corresponding frequency scales in the uniform time-frequency space established by the white noise. As the white noise added in each EMD run is different and uncorrelated with that of the other runs, the artificially added noise cancels out when the corresponding imfs from all of the EMD runs are averaged. Compared with EMD, EEMD can alleviate mode mixing, suppress the noise of the original signal, and obtain accurate imfs with clearer physical meaning. On the other hand, the effective selection of imfs eliminates redundant imfs without valuable motion features, according to the external shape features of the motion signal and the internal time-frequency characteristics in the T-F spectrum. Therefore, these two steps suppress noise and clutter and improve the SNCR, which helps in obtaining a clearer T-F distribution.

3rd step: Hilbert transform.
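Before the Hilbert transform is applied, the imf selection of the preceding step can be made concrete with a short sketch. It is a minimal illustration rather than the authors' code: the function names `cosine_similarity` and `select_effective_imfs` and the synthetic test components are assumptions, while the threshold of 0.3 is the value stated in the text.

```python
import math


def cosine_similarity(u, v):
    """Cosine of the angle between two equal-length real sequences."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0.0 or norm_v == 0.0:
        return 0.0  # an all-zero sequence is treated as dissimilar
    return dot / (norm_u * norm_v)


def select_effective_imfs(imfs, signal, threshold=0.3):
    """Keep the imfs whose cosine similarity with `signal` exceeds `threshold`."""
    scores = [cosine_similarity(imf, signal) for imf in imfs]
    kept = [imf for imf, s in zip(imfs, scores) if s > threshold]
    return kept, scores


if __name__ == "__main__":
    t = [k / 200.0 for k in range(400)]
    motion = [math.sin(2 * math.pi * 2.0 * x) for x in t]         # dominant component
    ripple = [0.1 * math.sin(2 * math.pi * 45.0 * x) for x in t]  # weak component
    signal = [m + r for m, r in zip(motion, ripple)]
    kept, scores = select_effective_imfs([motion, ripple], signal)
    print(len(kept), [round(s, 3) for s in scores])
```

Because the cosine similarity normalizes both vectors, a component is kept or rejected according to how much of the shape of r it explains, not according to its absolute amplitude scale.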
The HHT approach defines the instantaneous frequency as the rate of change of the phase obtained from the Hilbert transform.

The Q imfs in Equation (8) represent Q time series, which form the new signal $Z(t)$. The $i$-th imf, denoted $X_i(t)$, can then be represented by its analytic function $Z_i(t)$ as follows:

$$Z_i(t) = X_i(t) + jH\{X_i(t)\} = a_i(t)e^{j\theta_i(t)}$$

where $H\{X(t)\}$ denotes the Hilbert transform and

$$a_i(t) = \sqrt{X_i^2(t) + H^2\{X_i(t)\}}, \qquad \omega_i(t) = \frac{d\theta_i(t)}{dt}$$

are the instantaneous amplitude and instantaneous frequency of the imf, respectively. Therefore, the signal $Z(t)$ can be expressed as a linear combination of the real parts of $Z_i(t)$:

$$Z(t) = \sum_{i=1}^{Q} \mathrm{Re}\{Z_i(t)\} = \mathrm{Re}\sum_{i=1}^{Q} a_i(t)e^{j\int \omega_i(t)\,dt}$$

This implies that the amplitude is a function of time and frequency. The T-F distribution of the amplitudes is designated as the Hilbert spectrum $H(\omega, t)$, and it can be contoured on the T-F plane. At this point, the instantaneous Doppler frequencies can be obtained from the Hilbert spectrum. For the radar data $R\{r_m(n)\}$, M corresponding T-F representations are generated by executing steps 2-4 on the M channels of the radar data, denoted as $^{M}H(\omega, t)$. The joint-time-frequency-channel-representation (JTFCR) cube based on HHT is acquired by collecting the Hilbert spectra of all channel signals together, as shown in Figure 4.
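The Hilbert-transform step for a single imf can be sketched as follows. This is an illustrative, standard-library-only sketch, not the authors' implementation: the analytic signal is built with a direct O(N^2) DFT (adequate for short series), and the function names and the sampling rate used in the demo are assumptions. The instantaneous frequency is estimated from the phase increment between consecutive samples of the analytic signal.

```python
import cmath
import math


def analytic_signal(x):
    """Analytic signal x + j*H{x}, built by zeroing negative frequencies of the DFT."""
    n = len(x)
    spectrum = [sum(x[m] * cmath.exp(-2j * math.pi * k * m / n) for m in range(n))
                for k in range(n)]
    for k in range(n):
        if k == 0 or (n % 2 == 0 and k == n // 2):
            continue  # DC and Nyquist bins are kept unchanged
        # Double the positive frequencies, remove the negative ones.
        spectrum[k] *= 2.0 if k < n / 2 else 0.0
    return [sum(spectrum[k] * cmath.exp(2j * math.pi * k * m / n) for k in range(n)) / n
            for m in range(n)]


def instantaneous_amplitude_frequency(x, fs):
    """Instantaneous amplitude (len N) and frequency in Hz (len N-1) of one imf."""
    z = analytic_signal(x)
    amplitude = [abs(v) for v in z]
    # Phase increment between neighbouring samples, converted to Hz.
    frequency = [cmath.phase(z[m + 1] * z[m].conjugate()) * fs / (2 * math.pi)
                 for m in range(len(z) - 1)]
    return amplitude, frequency


if __name__ == "__main__":
    fs = 200.0  # assumed sampling rate in Hz, for the demo only
    tone = [math.cos(2 * math.pi * 10.0 * n / fs) for n in range(200)]
    amp, freq = instantaneous_amplitude_frequency(tone, fs)
    print(round(amp[50], 6), round(freq[50], 6))
```

Using the phase increment of neighbouring samples avoids explicit phase unwrapping, at the cost of limiting the estimate to frequencies below fs/2.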
To facilitate observation and subsequent feature extraction, the JTFCR is accumulated along the channel axis. Finally, a comprehensive time-frequency representation is acquired, which contains rich feature information on human motion.
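The construction of a single-channel Hilbert spectrum and the accumulation along the channel axis can be sketched as below. The rasterization onto a fixed frequency grid, the grid resolution, and the function names `hilbert_spectrum` and `accumulate_channels` are assumptions made for illustration; the text only states that the per-channel spectra are collected into a cube and accumulated over the channels.

```python
def hilbert_spectrum(amplitudes, frequencies, f_max, num_bins):
    """Rasterize per-imf (amplitude, frequency) tracks onto a frequency-by-time grid."""
    num_times = len(frequencies[0])
    grid = [[0.0] * num_times for _ in range(num_bins)]
    for amp, freq in zip(amplitudes, frequencies):
        for t in range(num_times):
            if 0.0 <= freq[t] < f_max:
                # The amplitude is deposited in the bin of its instantaneous frequency.
                grid[int(freq[t] / f_max * num_bins)][t] += amp[t]
    return grid


def accumulate_channels(cube):
    """Sum M single-channel spectra element-wise along the channel axis."""
    num_bins, num_times = len(cube[0]), len(cube[0][0])
    return [[sum(channel[b][t] for channel in cube) for t in range(num_times)]
            for b in range(num_bins)]


if __name__ == "__main__":
    # Two hypothetical channels, one imf each, three time samples, 10 Hz bins.
    ch1 = hilbert_spectrum([[1.0, 1.0, 1.0]], [[5.0, 5.0, 25.0]], 60.0, 6)
    ch2 = hilbert_spectrum([[2.0, 2.0, 2.0]], [[5.0, 15.0, 25.0]], 60.0, 6)
    total = accumulate_channels([ch1, ch2])
    print(total[0], total[1], total[2])
```

Components that appear at the same time and frequency in several channels reinforce each other in the accumulated spectrum, whereas uncorrelated contributions are spread over the grid.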

Micro-Doppler Analysis
The Doppler shift induced by a target moving at a constant velocity $v$ toward the radar is given as:

$$f_d = \frac{2vf}{c}$$

where $f$ is the frequency of the carrier wave and $c$ is the speed of light. For complex human motion, the micro-motion of the different moving structural body parts induces different micro-Doppler frequency components. Assuming that the number of structural body parts is $N$ and their corresponding velocities are $v_i(t)$ $(i = 1, 2, \ldots, N)$, the resulting comprehensive Doppler effect is the sum of the effects of the individual moving body parts:

$$f_d(t) = \sum_{i=1}^{N} \frac{2v_i(t)f}{c}$$

However, in practice, as illustrated in Figure 5, the structural body parts are all three-dimensional with different lengths and thicknesses, such as the upper arm and lower arm in the action of swinging one arm. Therefore, the upper arm, with its larger scattering area, causes a stronger MDS than the lower arm, with its smaller scattering area. In addition, these two body parts perform an oscillating motion with the shoulder as the origin. Each body part, having a certain scattering area, contains numerous scattering points, which cause different MDS. As depicted in Figure 5, setting the origin as $O$, the speed of the scattering point $S_j$ positioned at the pendulum shaft length $R_j(t)$ of the different structures is $v_j(t)$. To unify these two structures, the speed of each scattering point is expressed in terms of $L_{OS}$, where $L_{OS}$ represents the pendulum shaft length $R_j(t)$ from the origin $O$ to the scattering point $S_j$, $L_{OA}$ represents the length of the upper arm, and $L_{AB}$ represents the length of the lower arm. The scattering points positioned at different pendulum shaft lengths move at different speeds. Based on the principle that the Doppler frequency is proportional to the speed, the frequencies from the lower arm are much higher than those from the upper arm. Moreover, the frequency range of the lower arm is also wider than that of the upper arm, because the lower arm is longer than the upper arm: $L_{AB} > L_{OA}$.
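A short numerical sketch makes the relationship between shaft length, speed, and Doppler frequency tangible. The carrier frequency, the angular speed of the arm, and the arm lengths below are hypothetical values chosen for illustration and are not taken from the paper. The arm is treated as a rigid pendulum, so that a scatterer at shaft length L_OS moves at speed omega * L_OS, and the full tangential speed is assumed to lie along the radar line of sight, which gives an upper bound on the Doppler shift.

```python
SPEED_OF_LIGHT = 299_792_458.0  # m/s


def doppler_shift(radial_speed, carrier_hz):
    """Doppler shift f_d = 2 * v * f / c for a target moving along the line of sight."""
    return 2.0 * radial_speed * carrier_hz / SPEED_OF_LIGHT


def scatterer_speed(angular_speed, shaft_length):
    """Speed of a point on a rigid pendulum at `shaft_length` metres from the pivot."""
    return angular_speed * shaft_length


if __name__ == "__main__":
    carrier_hz = 2.0e9                 # assumed carrier frequency, not from the paper
    angular_speed = 6.0                # assumed peak angular speed of the arm, rad/s
    upper_arm, lower_arm = 0.28, 0.35  # assumed lengths L_OA and L_AB in metres
    for shaft in (0.5 * upper_arm, upper_arm, upper_arm + lower_arm):
        v = scatterer_speed(angular_speed, shaft)
        f_d = doppler_shift(v, carrier_hz)
        print(f"L_OS = {shaft:.2f} m -> v = {v:.2f} m/s, f_d = {f_d:.1f} Hz")
```

Under these assumptions, scatterers on the lower arm span shaft lengths from L_OA to L_OA + L_AB and therefore occupy a higher and wider Doppler band than those on the upper arm, which span 0 to L_OA.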

Experimental Results
This section uses the novel method to analyze and extract the micro-Doppler signatures of human activities hidden in the UWB radar signal in a through-wall environment, with the aim of investigating and demonstrating its advantages, including its T-F resolution, its broad applicability to different human activities, and its robustness against the strong noise and clutter arising from wall penetration or an increased detection range. Moreover, we employ the comprehensive T-F analysis method [7] based on the STFT (0.42 s Hanning window), which is considered the most commonly used and stable method, as the reference method.
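For orientation, the reference method can be sketched as a plain STFT with a 0.42 s Hanning window. The sketch below is a minimal, standard-library-only illustration and not the implementation of [7]; the sampling rate, the hop size, and the function names are assumptions. Whatever the sampling rate, a 0.42 s window fixes the spacing of the frequency bins at 1/0.42, about 2.4 Hz, and the window itself smears each component over neighbouring bins.

```python
import cmath
import math


def hann_window(length):
    """Symmetric Hann (Hanning) window."""
    if length == 1:
        return [1.0]
    return [0.5 - 0.5 * math.cos(2 * math.pi * n / (length - 1)) for n in range(length)]


def stft_magnitude(x, fs, window_seconds=0.42, hop_seconds=0.05):
    """Magnitude STFT; returns (bin frequencies in Hz, list of spectra, one per frame)."""
    win_len = int(round(window_seconds * fs))
    hop = max(1, int(round(hop_seconds * fs)))
    window = hann_window(win_len)
    freqs = [k * fs / win_len for k in range(win_len // 2 + 1)]
    frames = []
    for start in range(0, len(x) - win_len + 1, hop):
        seg = [x[start + n] * window[n] for n in range(win_len)]
        # Direct DFT of the windowed segment, non-negative frequencies only.
        frames.append([abs(sum(seg[n] * cmath.exp(-2j * math.pi * k * n / win_len)
                               for n in range(win_len)))
                       for k in range(len(freqs))])
    return freqs, frames


if __name__ == "__main__":
    fs = 100.0  # assumed sampling rate in Hz, for the demo only
    tone = [math.sin(2 * math.pi * 10.0 * n / fs) for n in range(200)]
    freqs, frames = stft_magnitude(tone, fs)
    print(round(freqs[1] - freqs[0], 3), len(frames))
```

Shortening the window would improve the time resolution but coarsen the frequency grid further, which is the fixed trade-off that an instantaneous-frequency representation avoids.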

Micro-Doppler Feature Analysis Based on MHHT and Validation of Structural Characteristics
For any human activity, both the specific frequency components generated by the motion of different body parts and the broadband, unpredictable frequency components arising from random noise and clutter are embedded in the UWB radar echo. Only a T-F analysis method with excellent T-F resolution can extract the detailed MDS and make them visible and separable. In this experiment, the proposed method and the reference method are applied separately to the activities of swinging one arm (the right arm) and swinging two arms while standing on the spot behind the wall, and the resulting spectra are shown in Figure 6. In addition, the correspondence between the frequency components and the structures of the moving human body is analyzed according to the radar Doppler principle combined with the principles of human body kinematics. The advantages of the novel method, namely its high time-frequency resolution and its ability to extract the detailed MDS of various moving body components, are evaluated qualitatively and visually.

For the activity of swinging one arm, comparing the spectrum based on the reference method (Figure 6a) with that based on the proposed method (Figure 6b), we can determine the following: (1) The frequency range with valuable motion characteristics is 0-60 Hz in both spectra. (2) The specific frequencies related to the motion of different body parts cannot be separated effectively by the reference method, as shown in Figure 6a; however, they exhibit clear region segmentation in the T-F domain in the spectrum based on the proposed method, shown in Figure 6b. (3) In the high-frequency range of 30-60 Hz, the frequency components characterizing the motion features are severely contaminated by noise and clutter for the reference method, while the proposed method shows excellent performance. (4) The instantaneous frequency characteristic changing with time can be observed in the results of the proposed method but not in those of the reference method.
Using the high time-frequency resolution spectrum shown in Figure 6b, we can analyze in detail the correspondence between the frequency components and the structures of the moving human body, based on the human body kinematics of swinging one arm. This activity mainly involves four structures (lower arm, upper arm, shoulder and the adjacent chest, and the torso), listed from the highest to the lowest motion speed, as shown in Figure 5. The analysis is as follows: (1) The average motion cycle is about 1.5 s in the experiment, and the arm swings forward and backward once per cycle. Thus, as shown in Figure 6b, we can find two frequency peaks within 1.5 s in the spectrum. (2) Based on the principle that the Doppler frequency is proportional to the speed, the lower arm accounts for the highest and widest frequency band, in the range of 28-60 Hz, because it is the longest moving cylindrical component in this activity and causes the largest range of velocity change. However, its strength is the weakest, owing to its smallest scattering area. (3) The second-highest frequency band, ranging from 13-28 Hz, originates from the motion of the upper arm. (4) The third-highest frequency components, ranging from 5-10 Hz and marked by the yellow curve, result from the motion of the shoulder and the adjacent chest. (5) The frequency component of the torso, marked by the black curve, is the lowest in frequency but the strongest, centered at 2 Hz; it is caused by the slight back-and-forth rocking driven by the swinging arm, and its strength is due to the torso having the widest scattering area.
Comparing the results of the two methods for swinging only the right arm and for swinging both arms, the reference method cannot distinguish or display the respective characteristics of the left and right arms, as shown in Figure 6c. With the proposed method, in contrast, not only the frequencies of the different structures but also their attribution to the corresponding arm can be distinguished, as shown in Figure 6d. Examples are the different phase information of the left and right arms, marked by white and red curves, respectively, and of the left and right shoulders, marked by yellow and purple curves, respectively.

Adaptability Test
To demonstrate the wide adaptability of this proposed method to different finer-grained human activities, another six common finer-grained human activities were exploited to conduct T-F analysis and their corresponding T-F spectra are shown in Figure 7.
Remote Sens. 2017, 9, 260

As the spectra in Figure 7 show, the proposed method achieves favorable analysis performance for different activities and extracts the detailed MDS of the different moving body components, and the spectra of the different activities display clearly distinguishable MDS: (1) For the spectrum of the piaffe shown in Figure 7a, in addition to MDS frequency components similar to those of swinging two arms shown in Figure 6d (two arms, shoulders, and torso), there are two additional strong frequency components at approximately 13 Hz with a fixed phase difference. They can therefore be attributed to the only moving body structures added in comparison with swinging two arms, namely the two upper legs, which move at a higher speed than the shoulders and have a relatively wide scattering area. (2) The activity of picking up an object mainly consists of two steps, bending down and standing up; therefore, the upper torso and head cause the primary MDS. This activity usually takes more time, and the speeds of the various body parts are low. Therefore, as shown in Figure 7b, the high-frequency range of 15-30 Hz arises from the head motion, while the strong low-frequency range of 2-15 Hz mainly arises from the torso motion. (3) The frequency components of waving are similar to those of swinging one arm, as shown in Figure 7c, but each is lower in frequency because the speeds of the corresponding body structures are lower than those in swinging one arm. (4) The spectrum of jumping up contains the most complex frequency components, and the highest frequency reaches 60 Hz due to the high-speed motion. Moreover, the simultaneous swinging of both arms to generate power for take-off, the arm swinging during landing, and the after-swing to restore balance after landing all cause significant MDS, as shown in Figure 7d. (5) Standing with random micro-shaking generates a very simple low-frequency component ranging from 0-7 Hz, as shown in Figure 7e, resulting from the free and slow shaking of the torso. In addition, the small and weak higher-frequency components above the torso band are derived from the micro-swing of the two arms driven by the torso motion. (6) There is only one frequency component, positioned at 0.2 Hz, as shown in Figure 7f; it accurately depicts the frequency characteristic of breathing while sitting and captures the weak periodic breathing motion.

Anti-Interference Ability Test
The anti-interference ability against noise and clutter is a critical index for evaluating the stability and environmental adaptability of an algorithm. This ability significantly affects the validity of the algorithm when analyzing and extracting the MDS of human motion in an imperfect detection environment with considerable background noise and clutter, such as in through-wall or long-distance detection. In this study, in order to simulate the strong interference environment as realistically as possible, we use the radar signal of the piaffe activity at different through-wall distances (4, 5, and 6 m) to conduct the test. As before, the STFT is taken as the reference method.
According to the spectra of the piaffe based on the two methods, shown in Figure 8, the MDS of the moving human body parts fade gradually and inevitably with increasing penetration range. However, for the STFT-based spectra shown in Figure 8a,c,e, the high-frequency components of human motion weaken significantly with increasing range; at 6 m, regular motion features are hardly observable and the spectrum is filled with disordered background noise. In contrast, despite slight attenuation, the proposed method can still extract and distinguish the human body MDS components from the complex and unfavorable background with a high degree of separation. Moreover, the movement rhythm and motion characteristics can be observed clearly in the spectrum of the proposed method.

Discussion
In this paper, eight kinds of finer-grained human activities were detected by the SFCW radar and analyzed by the MHHT-based T-F transform method to extract detailed MDS. For the activities of swinging one arm and two arms, as shown in Figure 6, the proposed method extracts the different frequency components of the body parts effectively and displays them clearly in the T-F spectrum. Compared with the T-F analysis used in human activity classification research [5,20], the proposed method substantially improves the T-F resolution of the human motion spectrum, which is critical for obtaining detailed information and capturing instantaneous motion characteristics. Moreover, compared with the MDS analysis of human activities using HHT based on EEMD [32], this paper determined and demonstrated the correspondence between the frequency components and the body structures, which will provide more accurate and representative motion features for better classification. More notably, even the time delay between the MDS changes caused by the moving left and right arms can be observed. In terms of adaptability, the proposed method is applicable to the various non-stationary movements shown in Figure 7, not just to stationary and regular actions such as walking with or without arm swing [20-22]. More importantly, whereas with the conventional T-F analysis method the MDS is easily swamped by clutter and noise [5], as shown in Figure 8, the proposed method retains the above two advantages even in a through-wall and remote detection environment. In other words, the proposed method yields a higher signal-noise-clutter ratio by removing noise and clutter, which improves the interpretability of the spectrum. Therefore, owing to its anti-interference ability, the proposed method can provide more detailed and accurate motion feature information in a poor environment, which is highly favorable for the practical application of through-wall detection and motion classification.
However, the energy of the MDS inevitably attenuates as the through-wall detection range increases, as shown in Figure 8. A reasonable and effective approach to feature enhancement is still needed to solve this attenuation problem. Moreover, the discussion of the results is largely qualitative and mostly visual. Effective techniques and indices should be developed to verify the performance of the novel method quantitatively in further study. In addition, as shown in all of the MHHT-based spectra, the frequency bands resulting from the motion of the various body structures are composed of discrete points and are missing in some time periods. Therefore, the extraction, segmentation, and quantification of the different MDS components in the T-F spectrum will be a significant challenge in our future work.

Conclusions
This paper proposed a novel T-F analysis method, EEMD-based MHHT combined with channel integration, for high T-F resolution analysis of the MDS of finer-grained human activities hidden in the UWB radar echo in a through-wall detection environment, which is critical for remote sensing in various military applications, such as urban warfare, hostage rescue, and earthquake post-disaster rescue.
During the experiments, first, the activities of swinging one or two arms while standing on a spot 3 m behind the wall were analyzed with the STFT-based and MHHT-based methods. The proposed method showed excellent abilities to extract and separate the MDS of the different moving body structures, with a higher T-F resolution than the reference method, and the correspondence between the frequency components in the T-F spectra and the structures of the moving human body was demonstrated according to the radar Doppler principle combined with the principles of human body kinematics. For the same activity, the proposed method also showed an excellent ability to capture instantaneous characteristics. Moreover, six common finer-grained human activities were used to test the adaptability of the proposed method to different activities, and the resulting spectra agreed with the motion characteristics of each activity. Finally, the piaffe at different ranges in the through-wall detection environment was used to simulate application environments with different SNCRs. Compared with the reference method, the spectra based on the proposed method still display an outstanding ability to extract MDS even in a severely affected detection environment. The high SNCR of the spectra of the proposed method improves the interpretation accuracy of human motion and demonstrates its capability to remove noise, which will provide critical advantages in practical application. Based on these advantages and characteristics, this novel approach can provide more detailed and accurate feature information on human motion as the foundation for a pattern recognition device during activity recognition and classification, even in a poor detection environment with a considerable amount of noise and clutter.

Figure 3. SFCW radar signal of a piaffe at a position 3 m behind the wall. (a) Original signal. (b) Preprocessed signal.

2nd step: Effective selection of the imfs.

4th step: Multiple channel accumulation of T-F spectrum.

Figure 5. Geometry of the radar and moving human body structures.

Figure 6. Spectra based on the two methods of the subject swinging one arm or two arms while standing on a spot 3 m behind the wall: (a) STFT-based spectrum of the activity of swinging the right arm. (b) MHHT-based spectrum of the activity of swinging the right arm. (c) STFT-based spectrum of the activity of swinging both arms. (d) MHHT-based spectrum of the activity of swinging both arms.

Figure 7. MHHT-based T-F spectra of a subject performing six kinds of finer-grained human activities while staying at a position 3 m behind the wall: (a) piaffe; (b) picking up an object; (c) waving; (d) jumping up; (e) standing with random micro-shaking; and (f) breathing while sitting.