Linear Discriminant Analysis-Based Motion Classification Using Distributed Micro-Doppler Radars with Limited Backhaul

In this paper, we propose a cooperative linear discriminant analysis (LDA)-based motion classification algorithm for distributed micro-Doppler (MD) radars that are connected to a data fusion center through a limited backhaul. Because of the limited backhaul, each radar cannot report the high-dimensional data of a multi-aspect angle MD signature to the fusion center. Instead, each radar reduces the dimensionality of its MD signature by using the LDA algorithm, and the dimensionally-reduced MD signatures are collected at the data fusion center. To further reduce the burden on the backhaul, we also propose a softmax processing method in which the distances of the sensed MD signatures from the centers of clusters for all motion candidates are computed at each radar. The output of the softmax process at each radar is quantized through pyramid vector quantization with a finite number of bits and is reported to the data fusion center. To improve the classification performance at the fusion center, the channel resources of the backhaul are adaptively allocated based on the classification separability at each radar. The classification performance of the proposed scheme was assessed with synthetic simulation data as well as experimental data measured through a USRP-based MD radar.


Introduction
Recently, micro-Doppler (MD) radars have been deployed to detect or recognize target motion due to their low implementation cost and robustness to harsh environmental conditions [1][2][3][4]. Specifically, target motions such as rotations and vibrations cause small-scale frequency shifts in the radar echo signal. The resulting spectrograms, known as MD signatures, generally differ according to the target motion. Furthermore, MD signatures can be captured by an MD radar whose operation is not affected by the surrounding conditions (e.g., bad weather or light intensity). Accordingly, MD radar-based motion detection and recognition have been applied to UAV identification systems [5], human activity classification, and smart homes [1][2][3]. However, the MD signature is generated from the radar echo signal of unstable scattering points on the target and is heavily dependent on the aspect angle of the target relative to the radar's line of sight [6].
To overcome this aspect-angle dependence, MD signatures from multiple perspectives are collected by using multiple radar nodes or a single monostatic radar that traverses the target [7]. In Özcan et al. [8], it is verified that the distributed radar configuration outperforms the colocated MIMO radar configuration because the former obtains multi-aspect MD signature data. However, in the open literature addressing the distributed radar configuration for motion recognition [9,10], the backhaul link over which the distributed radar nodes share their sensing data (i.e., radar echo signals) is assumed to be wired and to have an ideal unlimited capacity, which hinders flexible deployment of the distributed radar system.
In this paper, to obtain multi-aspect MD signature data, a distributed MD radar system with a limited backhaul link is considered, where the distributed radars cannot report the high-dimensional MD signature data to the fusion center because of the limited backhaul link. Note that the dimension of the MD signature depends on the size of the short-time Fourier transform (STFT) and the observation time duration. For example, in our experiment, a 64-point STFT was used and 2534 time samples were observed for one MD signature; the dimension of the MD signature was then approximately $1.62 \times 10^5$, which could cause significant overhead on a typical limited backhaul link. Accordingly, at each distributed radar, the dimension of the MD signature is reduced by exploiting the generalized singular value decomposition (GSVD)-based linear discriminant analysis (LDA) algorithm, which has been successfully used in dimension reduction applications [11][12][13][14]. We then develop a motion classification algorithm that exploits the dimensionally-reduced MD signatures from multiple perspectives collected at the data fusion center. Even though the dimension of the MD signature data is reduced, a naive element-wise quantization of the dimensionally-reduced MD signature data incurs significant distortion. Accordingly, rather than transmitting the dimensionally-reduced MD signature data, we propose a softmax processing method in which the distances of the sensed MD signature from the centers of clusters for all motion candidates are computed at each radar, and the distances are expressed as probabilities through the softmax function. Then, the output of the softmax process at each radar is quantized through pyramid vector quantization (PVQ) with a finite number of bits and is reported to the data fusion center. We note that PVQ was developed for the data compression and quantization of Laplacian-like data with arbitrary vector dimensions [15,16].
In addition, PVQ is an efficient quantization method when the sum of the vector components is fixed as a constant, where the codebook of PVQ consists of cubic lattice points on the L-dimensional pyramid. Accordingly, by exploiting a normalizing constant, we use PVQ to quantize the output of the softmax process. To improve the classification performance at the fusion center, the channel resources of the backhaul link are adaptively allocated based on the classification separability at each radar. Specifically, by allocating more bits (i.e., more resources of the backhaul link) to the distributed radar with larger separability, the motion classification performance at the data fusion center can be further improved. The classification performance of the proposed scheme was assessed with synthetic simulation data (MNIST hand-writing data [17]) as well as experimental data measured through multiple MD radars that we implemented by using USRP N210 devices with CBX daughterboards [18].
The rest of this paper is organized as follows. In Section 2, we introduce the system model for a distributed MD radar system and the associated signal model. In Section 3, MD signature-based motion classification using GSVD/LDA is developed for a single radar. In Section 4, we discuss the motion classification at the data fusion center by exploiting the dimensionally-reduced data reported from the distributed radars. In addition, we also propose the softmax processing method to further reduce the burden on the backhaul link and discuss the application of PVQ to the output of the softmax process at each radar. In Section 5, we provide several simulation results and experimental results. In Section 6, we give our conclusions.

System Model
Figure 1 shows the distributed MD radar system with a limited backhaul link, where $L$ distributed radars transmit continuous waveforms with different center frequencies. Since the radars are spatially distributed, each radar may receive an echo signal conveying a different MD signature for the same target motion. When considering $D$ target motions to be classified, the discrete received signal of the $l$th radar for the $i$th motion ($i = 1, \ldots, D$) can then be expressed as in (1) for $n = 1, \ldots, N$, where $T_s$ is the sampling time and $\nu^{(l)}[n]$ is zero-mean additive white Gaussian noise with variance $\sigma_n^2$. Here, $\alpha_i^{(l)}$ represents the aggregated channel gain, including path loss and antenna gain, and $\tau^{(l)}$ is the time delay associated with the range between the target and the $l$th radar. In (1), $\Delta f_i^{(l)}$ is the micro-Doppler shift characterized by the $i$th motion. In general, this MD signature is a function of time and can be clearly observed in the time-frequency domain by using the short-time Fourier transform (STFT) [4].
Accordingly, the STFT of the received signal $y_i^{(l)}[n]$ is computed with a window function $w(m)$ of length $M$. From (2), $F_{ij}^{(l)}$ and $N_{ij}^{(l)}$ can be vectorized in a similar way as in (4), which gives the vectorized signal model in (5). Throughout the paper, it is considered that the received radar signal at the $l$th distributed radar can be preprocessed and reported to the data fusion center through the limited backhaul link with a capacity of $N_l$ bits per channel use. That is, $N_l$ bits can be transmitted without error for one channel use of the $l$th distributed radar. In addition, it is assumed that $\sum_{l=1}^{L} N_l \le N_{\mathrm{total}}$.
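To make the dimension of an MD signature concrete, the following is a minimal sketch of an STFT followed by vectorization. It is not the authors' processing chain: the Hann window, the hop size, and the function names `stft_signature` and `vectorize` are assumptions introduced only for illustration.

```python
import numpy as np

def stft_signature(y, M, hop):
    """Return an M x (number of frames) complex STFT matrix of the 1-D signal y."""
    window = np.hanning(M)                    # window w(m) of length M (Hann is an assumption)
    starts = range(0, len(y) - M + 1, hop)    # frame start indices (hop size is an assumption)
    frames = np.stack([y[s:s + M] * window for s in starts], axis=1)
    return np.fft.fft(frames, axis=0)         # one M-point FFT per time frame

def vectorize(S):
    """Stack the columns of the STFT matrix into one long vector."""
    return S.reshape(-1, order="F")

# Toy echo: a complex tone that falls exactly on frequency bin 5 of a 64-point STFT.
n = np.arange(1024)
y = np.exp(2j * np.pi * 5 * n / 64)
S = stft_signature(y, M=64, hop=32)
v = vectorize(S)
```

The length of `v` is the number of frequency bins times the number of time frames, which is why a long observation window quickly produces a vector that is too large to report over a limited backhaul link.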

MD Signature-Based Motion Recognition Using GSVD/LDA in a Single Radar
LDA has been used for signal identification and classification to deal with high-dimensional data efficiently [11][12][13]. In LDA, a linear transformation matrix $G \in \mathbb{C}^{MN \times M_d}$ is computed from a given sample dataset to reduce the dimension of the high-dimensional dataset and simultaneously maximize the separability among different classes, where $M_d$ is the reduced dimension. For the motion recognition in the $l$th radar, the MD datasets are collected for each motion. Specifically, by referring to (4), let us denote $Y_{ij}^{(l)} \in \mathbb{C}^{MN \times 1}$ as the $j$th sample of the $i$th motion's MD signature data. Then, the collected dataset for the $i$th motion can be formulated as $A_i^{(l)} = [Y_{i1}^{(l)}, \ldots, Y_{iN_s}^{(l)}]$, where $N_s$ is the number of samples per cluster (i.e., one target motion). In addition, by using the collected datasets, three scatter matrices are computed as
$$S_w^{(l)} = \sum_{i=1}^{D} \sum_{j=1}^{N_s} (Y_{ij}^{(l)} - m_i^{(l)})(Y_{ij}^{(l)} - m_i^{(l)})^H,$$
$$S_b^{(l)} = \sum_{i=1}^{D} N_s (m_i^{(l)} - m_{\mathrm{total}}^{(l)})(m_i^{(l)} - m_{\mathrm{total}}^{(l)})^H,$$
$$S_t^{(l)} = \sum_{i=1}^{D} \sum_{j=1}^{N_s} (Y_{ij}^{(l)} - m_{\mathrm{total}}^{(l)})(Y_{ij}^{(l)} - m_{\mathrm{total}}^{(l)})^H,$$
which are respectively called the within-cluster, between-cluster, and total scatter matrices [11].
Here, $m_i^{(l)}$ is the average of the samples in the $i$th cluster, given as $m_i^{(l)} = \frac{1}{N_s} \sum_{j=1}^{N_s} Y_{ij}^{(l)}$, and $m_{\mathrm{total}}^{(l)}$ is the average of all samples in the collected datasets, given as $m_{\mathrm{total}}^{(l)} = \frac{1}{D} \sum_{i=1}^{D} m_i^{(l)}$. Note that the trace of $S_w^{(l)}$ measures the variance of the sample data within the same cluster, while the trace of $S_b^{(l)}$ measures the variance of the cluster mean vectors $m_i^{(l)}$ around $m_{\mathrm{total}}^{(l)}$, providing a measure of the distance between clusters. We note that, at the $l$th radar, by evaluating the Euclidean distance of the received data vector $Y^{(l)}$ from the representative vectors $m_i^{(l)}$, the motion can be identified as $\hat{i} = \arg\min_{i} \| Y^{(l)} - m_i^{(l)} \|$. However, processing high-dimensional data vectors requires high computational complexity.
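The scatter matrices and the nearest-mean rule described above can be sketched as follows. This is an illustration under stated assumptions rather than the authors' code: the between-cluster term is weighted by the cluster size, which is the usual LDA convention and makes the total scatter equal the sum of the other two, and the function names are hypothetical.

```python
import numpy as np

def scatter_matrices(clusters):
    """clusters: list of (dim, N_s) arrays, one sample per column, one array per motion."""
    means = [A.mean(axis=1, keepdims=True) for A in clusters]
    everything = np.hstack(clusters)
    total_mean = everything.mean(axis=1, keepdims=True)
    dim = clusters[0].shape[0]
    S_w = np.zeros((dim, dim), dtype=complex)
    S_b = np.zeros((dim, dim), dtype=complex)
    for A, m in zip(clusters, means):
        centered = A - m
        S_w += centered @ centered.conj().T          # spread inside one cluster
        d = m - total_mean
        S_b += A.shape[1] * (d @ d.conj().T)         # spread of the cluster means
    centered_all = everything - total_mean
    S_t = centered_all @ centered_all.conj().T       # spread of all samples
    return S_w, S_b, S_t, means

def nearest_mean(y, means):
    """Index of the cluster mean closest to y in Euclidean distance."""
    return int(np.argmin([np.linalg.norm(y - m.ravel()) for m in means]))
```

Working directly in the original space means every distance evaluation touches a vector with as many entries as the vectorized MD signature, which is the cost that the dimension reduction in the rest of this section avoids.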
For the dimensionally-reduced sample data $\tilde{Y}_{ij}^{(l)} = G^H Y_{ij}^{(l)}$, the scatter matrices can be given as $\tilde{S}_w^{(l)} = G^H S_w^{(l)} G$, $\tilde{S}_b^{(l)} = G^H S_b^{(l)} G$, and $\tilde{S}_t^{(l)} = G^H S_t^{(l)} G$. Note that, for motion recognition in the reduced dimensional space, it is desirable to maximize the trace of $\tilde{S}_b^{(l)}$ while minimizing the trace of $\tilde{S}_w^{(l)}$. Accordingly, the optimal linear transformation matrix $G$ at the $l$th radar can be found by maximizing the cost function $J^{(l)}(G)$ in (15). From (5), the scatter matrix in (8) can be rewritten in terms of the within-cluster scatter matrix excluding the channel gain and the noise.
We can also derive a corresponding expression for $S_b^{(l)}$ in terms of the channel gain. Accordingly, (15) can be rewritten as in (17). We note that the optimal transformation $\hat{G}^{(l)}$ in (17) can be obtained from the generalized eigenvectors associated with the $M_d$ largest generalized eigenvalues. Furthermore, it can be efficiently computed through the GSVD algorithm [11,13,14], which is adapted to our motion recognition scenario in Algorithm 1.
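2. Compute the SVD of $Z$ as $Z = P \begin{bmatrix} \Lambda & 0 \\ 0 & 0 \end{bmatrix} U^H$, where $P \in \mathbb{C}^{D(N_s+1) \times D(N_s+1)}$ and $U \in \mathbb{C}^{MN \times MN}$ are orthogonal and $\Lambda$ is an $s \times s$ diagonal matrix. Here, $s$ is the effective rank of $Z$.

3. Partition the matrix $P$ as $P = \begin{bmatrix} P_{11} & P_{12} \\ P_{21} & P_{22} \end{bmatrix}$, where $P_{11} \in \mathbb{C}^{D \times s}$, $P_{12} \in \mathbb{C}^{D \times (D(N_s+1)-s)}$, $P_{21} \in \mathbb{C}^{DN_s \times s}$, and $P_{22} \in \mathbb{C}^{DN_s \times (D(N_s+1)-s)}$ are submatrices of $P$; and compute the orthogonal matrix $V$ from the SVD of $P_{11}$ ($= W \Sigma V^H$).

4. Compute $X$ as $X = U \begin{bmatrix} \Lambda^{-1} V & 0 \\ 0 & I \end{bmatrix}$. Then, set the transformation matrix $\bar{G}^{(l)}$ as $\bar{G}^{(l)} = [[X]_1, \ldots, [X]_{M_d}]$, where $[A]_i$ denotes the $i$th column of a matrix $A$.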
Once $\bar{G}^{(l)}$ is obtained in (23), we can transform the MD signature data in (4) into a lower dimensional space. Accordingly, the motion can be identified via $\hat{i} = \arg\min_{i} z_i^{(l)}$, where $z_i^{(l)}$ denotes the distance of the MD signature data from the center of the $i$th cluster, given by $z_i^{(l)} = \| \bar{G}^{(l)H} Y^{(l)} - \tilde{m}_i^{(l)} \|$. Here, $\tilde{m}_i^{(l)}$ is the representative vector for the $i$th motion in the reduced dimensional space, given as $\tilde{m}_i^{(l)} = \bar{G}^{(l)H} m_i^{(l)}$.
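The GSVD-based computation of the transformation matrix can be sketched in a few lines of NumPy. This follows the generic LDA/GSVD recipe from the literature cited as [11] rather than the authors' exact Algorithm 1: the construction of the stacked matrix from the two scatter factors, the relative rank tolerance `tol`, and the function name `lda_gsvd` are assumptions made for illustration.

```python
import numpy as np

def lda_gsvd(clusters, n_components, tol=1e-10):
    """clusters: list of (dim, N_s) arrays. Returns a (dim, n_components) transform."""
    total_mean = np.hstack(clusters).mean(axis=1, keepdims=True)
    means = [A.mean(axis=1, keepdims=True) for A in clusters]
    # Factors of the scatter matrices: S_b = H_b H_b^H and S_w = H_w H_w^H.
    H_b = np.hstack([np.sqrt(A.shape[1]) * (m - total_mean) for A, m in zip(clusters, means)])
    H_w = np.hstack([A - m for A, m in zip(clusters, means)])
    Z = np.vstack([H_b.conj().T, H_w.conj().T])     # D(N_s + 1) rows
    P, sing, Uh = np.linalg.svd(Z, full_matrices=False)
    s = int(np.sum(sing > tol * sing[0]))           # effective rank of Z
    P11 = P[:len(clusters), :s]                     # top-left D x s block of P
    _, _, Vh = np.linalg.svd(P11)                   # P11 = W Sigma V^H
    V = Vh.conj().T
    X = Uh[:s].conj().T @ (V / sing[:s, None])      # first s columns of U, times Lambda^{-1} V
    return X[:, :n_components]
```

A call such as `lda_gsvd(clusters, n_components=2)` returns a matrix whose conjugate transpose maps each sample to two dimensions; classification then proceeds by the nearest reduced cluster mean, as in the text.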

Remark 1.
Note that, by combining (18) through (21) with the SVD of $P_{11}$ and the results in [11,19], where $P_{21} = \tilde{W} \tilde{\Sigma} V^H$, it can be shown that the columns of $X$ are the corresponding generalized eigenvectors.

Remark 2.
The cost function $J^{(l)}(\hat{G}^{(l)})$ in (15) is denoted as the separability. That is, when $J^{(l)}(\hat{G}^{(l)})$ is larger, the dimensionally-reduced sample data vectors are better clustered and the motion recognition performance is improved. Furthermore, from (17), the separability depends on the channel gain and the noise variance, which implies that the received SNR at each radar affects the motion recognition.

Motion Classification at the Data Fusion Center
As the MD signature generated from the radar echo signal is heavily dependent on the aspect angle of the target relative to the radar's line of sight, the multi-aspect MD signature data can be exploited for the motion classification. However, when the dimensionality of the MD signature data is large, the burden on the backhaul link increases. Accordingly, in this section, we describe how the dimensionally-reduced data are reported to the data fusion center effectively and how the reported data can be processed for the motion classification at the data fusion center.

Strategy 1: Motion Classification at the Data Fusion Center with the Dimensionally-Reduced Data
As shown in Section 3, the dimension of the MD signature data can be effectively reduced while maintaining the separability by using the GSVD/LDA algorithm. Accordingly, the $l$th radar can transmit $\bar{G}^{(l)H} Y^{(l)}$ to the data fusion center, rather than transmitting the MD signature $Y^{(l)}$.
Again, the average of the samples in the $i$th cluster, $m_i$, can be computed at the data fusion center from the dimensionally-reduced samples reported by the radars, and it serves as the representative vector for the $i$th motion at the data fusion center. Therefore, when $\bar{G}^{(l)H} Y_{\mathrm{test}}^{(l)}$ is received from the $l$th radar for a certain motion, $Y_{\mathrm{test}}$ is formulated as $Y_{\mathrm{test}} = [(\bar{G}^{(1)H} Y_{\mathrm{test}}^{(1)})^T, \ldots, (\bar{G}^{(L)H} Y_{\mathrm{test}}^{(L)})^T]^T \in \mathbb{C}^{LM_d \times 1}$, and the motion can be identified as $\hat{i} = \arg\min_{i} \| Y_{\mathrm{test}} - m_i \|$. We note that the dimension of $Y_{\mathrm{test}}$ can be further reduced through a linear transformation matrix $G \in \mathbb{C}^{LM_d \times M_d}$, which can be found by applying the GSVD/LDA algorithm to (29).
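Strategy 1 at the fusion center amounts to stacking the reduced vectors from all radars and applying a nearest-mean rule. The sketch below illustrates this under the assumption that the per-radar reduced vectors are simply concatenated; the data layout and both function names are hypothetical.

```python
import numpy as np

def fused_means_from_training(reduced_train):
    """reduced_train[l][i]: (M_d, N_s) array of reduced training samples of motion i at radar l."""
    n_radars, n_motions = len(reduced_train), len(reduced_train[0])
    return [
        np.concatenate([reduced_train[l][i].mean(axis=1) for l in range(n_radars)])
        for i in range(n_motions)
    ]

def fuse_and_classify(reduced_test, fused_means):
    """reduced_test: list with one reduced test vector per radar."""
    y_test = np.concatenate(reduced_test)            # stacked vector of length L * M_d
    distances = [np.linalg.norm(y_test - m) for m in fused_means]
    return int(np.argmin(distances))
```

The stacked vector has length equal to the number of radars times the reduced dimension, so the backhaul load of this strategy grows with both quantities.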

Strategy 2: Motion Classification at the Data Fusion Center with Softmax-Processed Data
Even though the dimension of the MD signature data is reduced via strategy 1, the elements of the dimensionally-reduced data $\bar{G}^{(l)H} Y^{(l)}$ have continuous complex values, which should be properly quantized when the backhaul link has a limited capacity. From (6), $N_l$ bits can be transmitted without error for a single channel use. A naive element-wise quantization incurs significant distortion on the dimensionally-reduced MD signature data. Note that, from (24), the cluster index of the data is determined as the one with the shortest distance from the center of each cluster (i.e., $\tilde{m}_i^{(l)}$). Accordingly, rather than transmitting the dimensionally-reduced MD signature data, by reporting the distances from the centers of clusters (i.e., $z_i^{(l)}$ in (25)), each radar can deliver useful information for cluster selection with limited resources.
To transmit $z_i^{(l)}$ to the data fusion center for the motion classification, we use the softmax process, which is widely applied to various multiclass classification problems, such as multinomial logistic regression and multiclass linear discriminant analysis [20]. The output of the softmax process at the $l$th radar is then given by (32) for $i = 1, \ldots, D$. At the fusion center, the motion can be identified from the softmax outputs of all radars as in (33). To improve the classification performance, because the cost function $J^{(l)}(G)$ indicates the separability at the $l$th radar (see Remark 2), the motion can instead be identified by maximizing the separability-weighted output of the softmax process as in (34), where $w_l = J^{(l)}(\hat{G}^{(l)})$.
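The softmax processing and the separability-weighted decision can be sketched as follows. Two points are assumptions, since the exact expressions are not reproduced here: the softmax is applied to the negated distances so that a closer cluster receives a higher probability, and the fusion rule is a weighted sum of the per-radar probability vectors. The function names are hypothetical.

```python
import numpy as np

def softmax_of_distances(z):
    """Map cluster distances to probabilities; a smaller distance gives a larger probability."""
    scores = -np.asarray(z, dtype=float)      # negation is an assumption about the sign convention
    scores -= scores.max()                    # shift for numerical stability
    e = np.exp(scores)
    return e / e.sum()

def fuse(prob_per_radar, weights=None):
    """Pick the motion with the largest (optionally weighted) sum of per-radar probabilities."""
    P = np.asarray(prob_per_radar, dtype=float)            # shape (L, D)
    w = np.ones(P.shape[0]) if weights is None else np.asarray(weights, dtype=float)
    return int(np.argmax(w @ P))

# Two radars that disagree: radar 1 favours motion 0, radar 2 favours motion 1.
P = [softmax_of_distances([1.0, 3.0, 3.0]), softmax_of_distances([2.0, 1.0, 3.0])]
unweighted = fuse(P)
weighted = fuse(P, weights=[0.1, 1.0])
```

In this toy case the unweighted rule follows the more confident radar, while giving radar 2 a much larger weight flips the decision, which is the mechanism by which a radar with higher separability can dominate the final classification.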

Pyramid Vector Quantization and Bit Allocation for a Limited Backhaul
The softmax output vector $P^{(l)}$ in (35) is quantized and reported to the data fusion center through the limited backhaul link with $N_l$ bits per channel use. We note that PVQ is an efficient quantization method when the sum of the vector components is fixed as a constant, where the codebook of PVQ consists of cubic lattice points on the $L$-dimensional pyramid [15]. That is, the components of an $L$-dimensional vector on the pyramid are integer-valued and the sum of all components is fixed as an integer, $K$. Accordingly, by denoting $N(L, K)$ as the set of codewords in the PVQ codebook, it is given as
$$N(L, K) = \Big\{ x = [x_1, \ldots, x_L] \in \mathbb{Z}^L : \sum_{i=1}^{L} |x_i| = K \Big\}.$$
Throughout the paper, considering quantization of the softmax output in (32), we assume that the $x_i$ in (36) are non-negative integers. Then, the number of codewords in $N(L, K)$, denoted $P(L, K)$, can be computed as [16]
$$P(L, K) = \binom{L + K - 1}{K}$$
or, recursively, $P(L, K) = P(L - 1, K) + P(L, K - 1)$, where $\binom{a}{b} = \frac{a!}{b!(a-b)!}$ is the binomial coefficient. Then, the number of bits required to transmit a codeword in $N(L, K)$ is given as $\lceil \log_2 P(L, K) \rceil$, where $\lceil \cdot \rceil$ is the ceiling operation. For example, the codebook of $N(3, 4)$ is shown in Figure 2, where the codewords are in three-dimensional space and the total number of codewords is $P(3, 4) = 15$. Accordingly, four bits are required to transmit the codeword per channel use.
To exploit PVQ to quantize $P^{(l)}$ in (35), the normalized PVQ codebook is defined as $\bar{N}(L, K) = \{ x / K : x \in N(L, K) \}$. When the output of the softmax process at the $l$th radar is given as $P^{(l)}$ and $N_l$ bits can be transmitted per channel use of the backhaul link, from (37), we can design $\bar{N}(D, K_l)$, where $K_l$ is determined as the maximum $K$ that satisfies the condition $P(D, K) \le 2^{N_l}$. Then, $P^{(l)}$ can be quantized to the nearest codeword $\hat{c}^{(l)} \in \bar{N}(D, K_l)$, which can be forwarded to the data fusion center. Then, at the data fusion center, the motion can be identified by maximizing the separability-weighted output of the softmax process computed from the quantized vectors $\hat{c}^{(l)}$, which is analogous to (34).
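The PVQ bookkeeping for non-negative codewords can be checked with a short sketch. It assumes the stars-and-bars count for non-negative integer vectors with a fixed sum and a brute-force nearest-codeword search, which is only practical for small dimensions; the function names are hypothetical and this is not the authors' quantizer.

```python
from itertools import product
from math import ceil, comb, log2

def pvq_count(L, K):
    """Number of non-negative integer vectors of length L whose entries sum to K."""
    return comb(L + K - 1, K)

def pvq_codebook(L, K):
    """Enumerate the codewords by brute force (fine for small L and K only)."""
    return [x for x in product(range(K + 1), repeat=L) if sum(x) == K]

def max_pyramid_size(D, n_bits):
    """Largest K whose codebook of D-dimensional codewords fits into n_bits."""
    if pvq_count(D, 1) > 2 ** n_bits:
        raise ValueError("not enough bits for even the smallest codebook")
    K = 1
    while pvq_count(D, K + 1) <= 2 ** n_bits:
        K += 1
    return K

def pvq_quantize(p, K):
    """Nearest codeword of the normalized codebook (entries x_i / K) to the vector p."""
    best = min(
        pvq_codebook(len(p), K),
        key=lambda x: sum((xi / K - pi) ** 2 for xi, pi in zip(x, p)),
    )
    return [xi / K for xi in best]

bits_needed = ceil(log2(pvq_count(3, 4)))
quantized = pvq_quantize([0.7, 0.2, 0.1], K=4)
```

With three motions and four bits, `max_pyramid_size(3, 4)` selects the pyramid with 15 codewords from the example in the text, and a probability vector such as (0.7, 0.2, 0.1) is mapped to a codeword whose entries are multiples of one quarter.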

Bit allocation for Limited Backhaul Link
From (37) and (39), it can be found that, as the number of bits increases, the associated $K$ can be increased. That is, the Euclidean distance between the codewords is reduced, resulting in a reduction of quantization errors. Accordingly, by allocating more bits (i.e., more resources of the backhaul link) to the distributed radar with larger separability, the motion recognition performance at the data fusion center can be further improved. Specifically, from (6), the number of bits per channel use for the $l$th radar, $N_l$, is determined in proportion to its separability as $N_l = \left\lfloor \frac{w_l}{\sum_{l'=1}^{L} w_{l'}} N_{\mathrm{total}} \right\rceil$, where $w_l$ is defined in (34) and $\lfloor \cdot \rceil$ is the rounding operation. By allocating more resources of the backhaul link to the radar with a larger separability value, the data from that radar can be exploited at the data fusion center with a smaller quantization error. Interestingly, from Remark 2 and (17), when the separability excluding the channel gain and the noise (i.e., the ratio of the traces of the two scatter matrices) is the same for all the radars, the radar with a higher received SNR is allocated more backhaul link resources.
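The proportional bit allocation can be illustrated with the separability values reported later for the first experiment. The total budget of 12 bits is a hypothetical choice made only for this example, and the function name is likewise an assumption.

```python
from math import floor

def allocate_bits(separability, n_total):
    """Round each radar's proportional share of the bit budget to the nearest integer."""
    total = sum(separability)
    return [floor(n_total * w / total + 0.5) for w in separability]

# Separabilities of the three radars from the first experiment; 12 bits is a hypothetical budget.
bits = allocate_bits([0.0874, 0.0633, 0.1183], n_total=12)
```

Because each share is rounded independently, the allocated bits can in general add up to slightly more or less than the budget, so a practical scheme would need a final adjustment step to respect the sum constraint; in this particular example the shares happen to add up exactly.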

Simulation with MNIST Hand-Writing Data
Before implementing the motion classification using distributed MD radars with limited backhaul, we first verified the classification performance of the GSVD/LDA-based dimension reduction in a distributed system by applying it to an image classification problem with MNIST hand-writing data [17]. We note that the MNIST dataset consists of hand-written numbers from 0 to 9; it is widely used as pilot data for image classification, including deep learning systems [21]. Specifically, in this subsection, the classification of the number set {1, 2, 3} from hand-written images with 784 pixels is considered. Accordingly, the hand-written images with 784 pixels were used in place of the MD signature $F_{ij}^{(l)}$ in (5). Throughout the simulation, the number of nodes was set to $L = 3$, and at each node, $G^{(l)}$ was computed by using 1000 MNIST training samples for each number. In Figure 4, the average classification rates are provided for strategy 1 in Section 4.1 and strategy 2 with (33) and (34) in Section 4.2 when (a) $[\alpha^{(1)}, \alpha^{(2)}, \alpha^{(3)}] = [0.14, 1, 1.4]$ and (b) $[\alpha^{(1)}, \alpha^{(2)}, \alpha^{(3)}] = [1, 1, 1]$. For comparison purposes, we also evaluated the average classification rates at the distributed nodes without sharing their sensing data with the fusion center. The quantization of the data shared from each node to the fusion center through the backhaul link was not considered.
As shown in Figure 4a, it is obvious that the classification rates at the fusion center are better than those of each node without sharing the sensing data. Interestingly, strategy 2 with (34) (i.e., using the separability-weighted output of the softmax process) outperformed other schemes, because the path-loss (or SNR) affects the separability at each node, as discussed in Remark 2, and the classification quality at each node can be reflected on the final classification decision when strategy 2 with (34) is exploited. In Figure 4b with the same path loss, the performances of strategy 2 with (33) and (34) are almost the same.
The case of $N_{\mathrm{total}} = \infty$ implies that the backhaul link has no resource limitation, so the data shared by each node are reported to the fusion center unquantized and without any distortion. From the figure, it can be found that, as $N_{\mathrm{total}}$ increases, the classification performance improves. In addition, when the resources are allocated proportionally to the separability, as in (42), the classification performance can be further improved compared to that with equal resource allocation over the distributed nodes.

A Test with MD Signatures Measured through USRP-Implemented Radars
To verify the performance of the proposed LDA-based motion classification using the distributed MD radars, we implemented the distributed MD radars by using USRP N210 devices with CBX daughterboards and log-periodic PCB directional antennas [18], as in Figure 6; the associated GNU Radio flowgraph is shown in Figure 7. Here, the carrier frequencies of the three MD radars were set to 4.1, 4.3, and 4.5 GHz, and the sampling rate was set to 200 kHz. Note that we used carrier frequencies with a large spacing (i.e., 200 MHz) to avoid inter-radar interference without increasing the implementation complexity, but the frequency spacing can be further reduced if a proper resource scheduling method is used. In addition, the center carrier frequency of 4.3 GHz was chosen as in Liu and Chen [22] for the hand motion-aware radar, but the proposed scheme can be extended to other frequency bands without difficulty. Throughout the experiment, the number of time samples and the window size for the STFT were set as N = 2533 and M = 63.
For three different hand gestures (hand flip-flop, clapping, and fist-clap), the MD signatures were collected by using the distributed MD radars, and the associated snapshots are provided in Figure 8. We note that the aspect angles of the same gesture were different for the distributed radars, and accordingly, the MD signatures appeared differently at the distributed radars, even for the same gesture.
In Table 1, the motion classification rates are listed for the experimental settings in Figure 6. Here, 120 MD signature samples per motion were collected at each radar; half of these were randomly chosen and used to compute $G^{(l)}$ ($l = 1, 2, 3$), and the other 60 samples were used to test the classification performance. In Table 1, motions 1, 2, and 3 correspond to hand flip-flop, clapping, and fist-clap, respectively. As a baseline without the proposed strategies, we evaluated the classification performance at a single radar without data sharing with the fusion center. In addition, we evaluated the performance for the scenario in which the backhaul link between the distributed radars and the fusion center was wired, implying that $N_{\mathrm{total}} = \infty$; this is denoted as strategy 1 with $N_{\mathrm{total}} = \infty$. We note that the separabilities of the distributed radars are given as [0.0874, 0.0633, 0.1183], which implies that the MD signatures from the third radar are more beneficial for the classification. From Table 1, it can be found that combining multi-aspect angles from distributed radars at the fusion center outperforms the classification at a single radar without data sharing with the fusion center. In addition, as the number of bits per channel use (i.e., the resource of the backhaul link) increases, the classification rate also increases.
Importantly, when the resources are allocated proportionally to the separability, as in (42), the classification rate can be further improved compared to that with equal bit allocation over the distributed nodes, which coincides with the observation in the simulation with MNIST hand-writing data.
Table 1. Classification rates for the experiment settings in Figure 6.
To see the effect of the diversity of the multi-aspect angles on the recognition performance, we performed an additional motion classification experiment in which the distributed radars were placed close together, as in Figure 9. Table 2 shows the classification rates for the experiment in Figure 9. The separabilities of the distributed radars were [0.0426, 0.1212, 0.1272]. We can find a trend similar to that in Table 1: combining multi-aspect angles at the fusion center outperformed the classification at a single radar, and when the resources were allocated proportionally to the separability, the classification rate could be further improved. Interestingly, the overall classification rates in Table 2 are worse than those in Table 1, because the diversity of the multi-aspect MD signature data is reduced when the distributed radars are placed close together. Specifically, when the distributed radars are placed close together, as in Figure 9, the radar signal received by each radar has a similar incoming angle from the target. In contrast, in Figure 6, the third radar can observe the target motion with a relatively different aspect angle compared to the other distributed radars. This reduced diversity resulted in the degradation of the overall classification rates in Table 2 compared to those in Table 1.
When the radars were co-located, the recognition rate of the proposed strategies was low, especially for motion 3, because motion 3 (fist-clap) had an MD signature similar to those of the other motions. Note that this performance degradation can be overcome by increasing the diversity of the multi-aspect angles, as in Figure 6.
Table 2. Classification rates for the experiment in Figure 9.

Conclusions
In this paper, we proposed an LDA-based motion classification algorithm using the MD signatures obtained from distributed MD radars, in which the distributed radars are connected to the data fusion center through a limited backhaul link. Because of the limited backhaul link, the dimension of the MD signature is reduced at each radar by using the LDA algorithm, and the dimensionally-reduced MD signatures from multiple perspectives are collected at the data fusion center. To further reduce the burden on the backhaul link, we also proposed a softmax processing method in which the output of the softmax process at each radar is quantized through PVQ with a finite number of bits and is reported to the data fusion center. To improve the classification performance, the channel resources of the backhaul link are adaptively allocated based on the classification separability at each radar. Through computer simulations and an experiment, we demonstrated that the proposed algorithm (i.e., LDA-based motion classification with softmax processing, PVQ, and the separability-weighted bit allocation) exhibits a considerable performance improvement under a limited backhaul link, with performance comparable to that without any resource limitation in the backhaul link.