Classification of Targets Using Statistical Features from Range FFT of mmWave FMCW Radars

Abstract: Radars with mmWave frequency modulated continuous wave (FMCW) technology accurately estimate the range and velocity of targets in their field of view (FoV). The angle of arrival (AoA) estimation of targets can be improved by increasing the number of receiving antennas or by using multiple-input multiple-output (MIMO). However, obtaining target features such as target type remains challenging. In this paper, we present a novel target classification method based on machine learning and features extracted from a range fast Fourier transform (FFT) profile by using mmWave FMCW radars operating in the frequency range of 77–81 GHz. The measurements are carried out in a variety of realistic situations, including pedestrian, automotive, and unmanned aerial vehicle (UAV, also known as drone) targets. Peak height, width, area, variance, and range are collected from range FFT profile peaks and fed into a machine learning model. In order to evaluate the performance, various lightweight classification models such as logistic regression, Naive Bayes, support vector machine (SVM), and light gradient boosting machine (LightGBM) are used. We demonstrate our findings by using outdoor measurements and achieve a classification accuracy of 95.6% by using LightGBM. The proposed method will be useful in a wide range of applications, including cost-effective and dependable ground station traffic management and control systems for autonomous operations, and advanced driver-assistance systems (ADAS). The presented classification technique extends the potential of mmWave FMCW radar beyond the detection of range, velocity, and AoA to classification. As a result of this added feature, mmWave FMCW radars will be more robust in computer vision, visual perception, and fully autonomous ground control and traffic management cyber-physical systems.


Introduction
There exists a wide variety of sensors for sensing and perception of the surrounding environment, such as cameras, LiDAR, ultrasound, infrared (IR), thermal cameras, and radar. The fast Fourier transform (FFT) of the intermediate frequency (IF) signal provides the range profile. The peaks in the range profile determine the radial range of the objects. In addition, time-frequency analysis techniques such as micro-Doppler analysis have been investigated in some cases where targets have specific repeating patterns. This, however, increases the signal processing complexity, resulting in unacceptable latency in some application scenarios [17][18][19][20][21][22]. Furthermore, such techniques are limited to static targets [23,24].
Machine learning techniques have recently been investigated by using mmWave radar data. Surface classification with millimeter-wave radar has been accomplished through the use of temporal features [25]. In [26], it was proposed to classify small UAVs and birds by using micro-Doppler signatures derived from radar measurements. The use of micro-Doppler spectrograms in cognitive radar for deep learning-based classification of mini-UAVs has been proposed in [27]. Cadence velocity diagram analysis has been proposed for detecting multiple micro-drones in [28]. Convolutional neural networks with merged Doppler images have been proposed in [8] for UAV classification. The use of micro-Doppler analysis to classify UAVs in the Ka band has been proposed in [29]. The detection of small UAVs has been proposed using a radar-based EMD algorithm and the extraction of micro-Doppler signals in [30], and using radar-based cyclostationary phase analysis of micro-Doppler parameters in [31]. UAV detection has been proposed by using regularized 2D complex-log spectral analysis and micro-Doppler signature subspace reliability analysis in [32].
A multilayer perceptron artificial neural network has been proposed for classifying single and multi-propelled miniature drones in [33]. It has been proposed to use FMCW radar to classify stationary and moving objects in road environments in [34]. The detection of road users such as pedestrians, cyclists, and cars using a 3D radar cube has been proposed by using CNN in [35]. CNN is used for classification, followed by clustering. Based on the Euclidean distance softmax layer, a method for classifying human activity by using mmWave FMCW radar has been proposed in [36]. Several deep learning-based methods for detecting human activity using radar are summarized in [37]. All of these works, however, used spectrograms or time-frequency representations derived from spectrograms, such as cepstrogram and CVD, which necessitates additional signal processing. The additional features of an intermediate frequency (IF) signal's range FFT profile have not been thoroughly investigated. It has been demonstrated in [38] that by utilizing the features from the range FFT profile, additional information about the objects can be extracted.
The ability to detect target features such as shape and size, as well as the dynamic parameters of these targets, is critical. Such enhancements will improve the reliability and robustness of any system that utilizes radars. While the IF signal explicitly provides the object's range, the distinguishing characteristics of different objects are obtained by extracting statistical parameters from the range FFT plot, such as peak height, peak width, standard deviation, and area under the peaks. Experiments have been carried out in order to categorize three common objects: an unmanned aerial vehicle (UAV), a car, and a pedestrian. A number of ML algorithms are used to classify the targets from the statistical features extracted from the IF signal range FFTs of the radar measurements with different objects. The lightweight machine learning algorithms investigated include logistic regression, Naive Bayes, support vector machine (SVM), and Light Gradient Boosting Machine (LightGBM). This is the first paper to apply ML to range FFT statistical features for classification with mmWave radars. The major contributions of the work are as follows:

1. Outdoor experiments have been carried out to categorize three common objects: an unmanned aerial vehicle (UAV), a car, and a pedestrian.
2. Statistical parameters such as peak height, peak width, standard deviation, and area under peaks are extracted from the range profile of the radar data.
3. The targets are classified by using various ML models and the statistical features extracted from the IF signal range FFTs of the radar measurements with different objects.
In complex situations, however, range profiles may not provide higher classification accuracy. The combination of mmWave radars with additional sensors such as RGB cameras, thermal cameras, and infrared cameras improves reliability and classification accuracy.
The rest of this paper is structured as follows. The system is described in Section 2. The experimental setup is described in detail in Section 3. Section 4 presents the data set, signal processing, details of the machine learning models, and the performances. The detailed data set and algorithms are available at https://github.com/aveen-d/Radar_classification (accessed on 19 June 2021) [39].
Finally, concluding remarks and possible future work are discussed in Section 5.

mmWave FMCW Radar and Data Acquisition
The measurements were taken outdoors by using a Texas Instruments (TI) complex baseband FMCW mmWave radar. The radar is equipped with three transmitting and four receiving antennas. The radar's front-end complex baseband architecture is depicted in Figure 2. In Figure 3, the starting frequency (f_c), bandwidth (BW), and chirp slope (S) during one chirp period (T_chirp) are shown. The transmitted chirp's instantaneous frequency is given by (1), and its phase is given by (2).
Using (1) and (2), the transmitted chirp within a period (T_chirp) is given by (3), where f_tr(t) represents the frequency of the transmitted chirp and φ_tr(t) represents its phase [40]. Similarly, the received signal following reflection from a remote target is simply a delayed version of the transmitted signal and is given by (4), where τ = 2R/c represents the time delay between the transmitted and received signals, R represents the radial range of the target from the radar, and c represents the velocity of light in a vacuum. The transmitted and received chirp patterns are depicted in Figure 3. The complex IF signal is created by mixing the reflected chirp from the targets with the in-phase and quadrature-phase components of the transmitted chirp, as illustrated in Figure 2. This complex IF signal is first processed with a low-pass filter before being digitized at a sampling rate of 10 Msps [2,40]. The frequency of the IF signal is proportional to the radial range of the target and is given by (5).
The range is given by (6), where BW, R, f_IF, c, and S represent the RF bandwidth, range, IF signal frequency, velocity of light in a vacuum, and chirp slope, respectively.
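The numbered equations referred to in this section are not reproduced in the text. As a reading aid, the following is a sketch of the standard linear-chirp FMCW relations that are consistent with the symbols defined above. The amplitudes A_tr and A_rx, the zero initial phase, and the cosine form of the chirp are assumptions, so the authors' exact expressions may differ.

```latex
\begin{align}
f_{tr}(t) &= f_c + S\,t, \qquad 0 \le t \le T_{chirp} \tag{1}\\
\phi_{tr}(t) &= 2\pi \int_0^{t} f_{tr}(u)\,du
             = 2\pi\left(f_c\,t + \tfrac{1}{2} S\,t^{2}\right) \tag{2}\\
x_{tr}(t) &= A_{tr}\cos\!\big(\phi_{tr}(t)\big) \tag{3}\\
x_{rx}(t) &= A_{rx}\cos\!\big(\phi_{tr}(t-\tau)\big), \qquad \tau = \frac{2R}{c} \tag{4}\\
f_{IF} &= S\,\tau = \frac{2\,S\,R}{c} \tag{5}\\
R &= \frac{c\,f_{IF}}{2\,S}, \qquad S = \frac{BW}{T_{chirp}} \tag{6}
\end{align}
```

Under these relations, the range resolution of a single chirp is c/(2 BW), which is why a wider swept bandwidth yields a finer range profile.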

Radar Configuration Details
The mmWave radar configuration parameters are shown in Table 1. The raw ADC data of the complex IF signal are obtained from the radar and then post-processed in MATLAB in order to separate the data files for the four channels in the frame structure, as shown in Figure 4. Each measurement consists of 200 frames. Each frame is composed of 128 chirp loops, each of which contains 256 samples.

Range FFT
The FFT algorithm converts the time-domain sampled complex IF signal data to the frequency domain. Each chirp in each frame is processed to obtain the range FFT spectrum. After that, the range FFT is converted to an amplitude (dBFS) versus range (m) plot, where (6) can be used to calculate the range in meters from the frequency, and dBFS denotes the decibel full-scale value of the signal amplitude. This range FFT plot is further processed by using a peak detection algorithm. The peaks in the range FFT spectrum represent targets in the mmWave radar's field of view.
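The processing chain described above can be sketched in Python with NumPy as follows. This is an illustration under stated assumptions, not the authors' MATLAB code: the chirp slope `SLOPE`, the full-scale amplitude `FULL_SCALE`, the Hann window, and the synthetic single-target signal from the hypothetical helper `synth_if_chirp` are all assumptions. Only the 10 Msps sampling rate and the 256 samples per chirp come from the text. The frequency-to-range mapping uses the standard relation R = c f_IF / (2S).

```python
import numpy as np

C = 3.0e8          # speed of light (m/s)
FS = 10e6          # ADC sampling rate stated in the text (10 Msps)
N_SAMPLES = 256    # samples per chirp stated in the text
SLOPE = 30e12      # chirp slope in Hz/s (ASSUMED value, 30 MHz/us)
FULL_SCALE = 1.0   # ADC full-scale amplitude (ASSUMED)


def synth_if_chirp(target_range_m, amplitude=0.1, n=N_SAMPLES):
    """Hypothetical helper: ideal complex IF tone for one point target."""
    t = np.arange(n) / FS
    f_if = 2.0 * SLOPE * target_range_m / C   # IF frequency for this range
    return amplitude * np.exp(2j * np.pi * f_if * t)


def range_profile_dbfs(chirps):
    """Return (ranges in m, mean dBFS per range bin) for an array of chirps.

    chirps: complex array of shape (n_chirps, n_samples).
    """
    chirps = np.atleast_2d(chirps)
    n = chirps.shape[-1]
    window = np.hanning(n)                    # window choice is an assumption
    spectrum = np.fft.fft(chirps * window, axis=-1)
    # Normalise so that a full-scale tone on a bin centre reads 0 dBFS.
    magnitude = np.abs(spectrum) / (window.sum() * FULL_SCALE)
    dbfs = 20.0 * np.log10(magnitude + 1e-12)
    mean_dbfs = dbfs.mean(axis=0)             # mean of dBFS over all chirps
    freqs = np.arange(n) * FS / n             # complex sampling: 0 .. FS
    ranges = C * freqs / (2.0 * SLOPE)        # frequency -> range
    return ranges, mean_dbfs


if __name__ == "__main__":
    chirps = np.tile(synth_if_chirp(10.0), (128, 1))
    ranges, profile = range_profile_dbfs(chirps)
    k = int(np.argmax(profile))
    print(f"peak at {ranges[k]:.2f} m, {profile[k]:.1f} dBFS")
```

With the assumed slope, the 256 bins span 0 to 50 m in steps of roughly 0.2 m, which comfortably covers the 25 m measurement scene.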
The clutter is removed during preprocessing. Radar clutter is classified into two types: mainlobe clutter and sidelobe clutter [41]. The mainlobe clutter is caused by unwanted ground returns within the radar beamwidth (mainlobe), whereas the sidelobe clutter is caused by unwanted returns from any other direction outside the mainlobe. When the radar is placed at a lower height from the ground, the main lobe/sidelobe intersects the ground. Since the area of ground in the radar beam is often quite large, the ground return can be much larger than the target return. The clutter associated with ground returns close to the radar is removed by removing the associated components per range bin in range FFT.

Features Extraction
Feature extraction details are presented in this section. The range FFT plot is used to identify peaks, and then features for each peak are extracted. Among the features derived from the detected peaks in the FFT spectrum are the radial range of the target, the height of the peak, the peak width, the standard deviation, and the area under the peak. In general, only peaks are used to determine whether or not a target is present in the radar's field of view [42][43][44][45]. Although other target parameters such as velocity and angle of arrival can be extracted from the radar measurements, target features such as size and shape cannot be estimated. However, targets can be classified by combining the aforementioned range FFT features with lightweight machine learning models.
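As an illustration of how the five features could be computed from a range profile, the sketch below uses NumPy only. The definitions are assumptions made for the example: the peak extent is taken as the contiguous region within `drop_db` of the highest peak, the area is the trapezoidal integral of the profile above that level, and the standard deviation is the spread of range weighted by the same lift. The function name `peak_features` is hypothetical, and the authors' peak-finding routine may define width and area differently.

```python
import numpy as np


def peak_features(ranges, profile_db, drop_db=6.0):
    """Features of the highest peak in a range profile (illustrative definitions).

    ranges:     1-D array of range-bin centres in metres.
    profile_db: 1-D array of amplitudes in dBFS, same length as ranges.
    drop_db:    level below the peak that bounds the peak region (ASSUMED).
    """
    ranges = np.asarray(ranges, dtype=float)
    profile_db = np.asarray(profile_db, dtype=float)

    k = int(np.argmax(profile_db))            # highest peak marks the target
    level = profile_db[k] - drop_db

    # Grow the peak region left and right while the profile stays above level.
    left = k
    while left > 0 and profile_db[left - 1] >= level:
        left -= 1
    right = k
    while right < len(profile_db) - 1 and profile_db[right + 1] >= level:
        right += 1

    seg_r = ranges[left:right + 1]
    lift = profile_db[left:right + 1] - level  # non-negative by construction

    width = seg_r[-1] - seg_r[0]
    # Trapezoidal area of the lift over range, in dB x m.
    area = float(np.sum(0.5 * (lift[1:] + lift[:-1]) * np.diff(seg_r)))
    # Lift-weighted standard deviation of range, in m.
    weights = lift / lift.sum()
    mean_r = float(np.sum(weights * seg_r))
    std = float(np.sqrt(np.sum(weights * (seg_r - mean_r) ** 2)))

    return {
        "range_m": float(ranges[k]),
        "height_dbfs": float(profile_db[k]),
        "width_m": float(width),
        "area": area,
        "std_m": std,
    }
```

Each measurement then contributes one five-element feature vector (range, height, width, area, standard deviation) to the classifier input.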

Machine Learning Models
Once the features are extracted, lightweight machine learning techniques such as logistic regression, support vector machine, Light GBM, and Naive Bayes are applied. These machine learning models, as well as their key performance outcomes, are elaborated in detail in Section 4.

Target Classification
Three common targets, namely a car, a pedestrian, and a UAV, are classified by using the extracted range profile features and lightweight machine learning models. Additional targets can be added to the model by taking measurements with the targets of interest.

Measurements and Signal Processing
The measurement setup is lightweight and portable. It is made up of a mmWave FMCW radar with three transmitters and four receivers that operates in the frequency range of 77 to 81 GHz. Texas Instruments' mmWave Studio application is used to configure and control the radar setup. The configuration parameters of the radar used in these measurements are shown in Table 1. The algorithm used for the feature extraction of the objects is shown in Algorithm 1, and a flowchart explaining it is shown in Figure 5. Measurements are made with three common objects in an outdoor environment, as shown in Figure 6. The drone used in the measurements is quite small: it measures 214 × 91 × 84 mm when folded and 322 × 242 × 84 mm when unfolded. The vehicle used was a medium-sized automobile with dimensions of 4315 × 1780 × 1605 mm. Measurements for the pedestrian were taken with a 172 cm tall adult. All three objects have distinct shapes and sizes. For each object, several measurements were taken in small range steps up to a range of 25 m, which was the limit of the measurement scene. The radar station was fixed, and the objects were moved away from the radar in small steps between measurements.
The data collected using the mmWave sensor are arranged into four channels, and post-processing is performed on the 200 × 128 chirp loops of one channel. A fast Fourier transform is applied to each chirp loop, which consists of 256 samples. The result is converted to dBFS, and the mean dBFS over all chirp loops is calculated for each of the 256 samples. The mean dBFS vs. distance plot is obtained using MATLAB. The highest peak in the plot indicates the object location. A sharp peak can be obtained after the removal of a static plane. The features of the highest peak are extracted from this plot. This work has established a relationship between these extracted peak features and the object.
This relationship is used to identify the type of object present in the vicinity of the mmWave sensors. All the extracted features from the measurements are shown in Figure 7. It is clear from Figure 7 that features extracted from the range FFT plot, such as the standard deviation of the peak, the area under the peak, the peak width, and the peak height, provide distinguishable information about the targets. This makes sense because targets with a large cross-section reflect more power and, as a result, produce larger peaks in the range FFT plot.
[Algorithm 1 (fragment): features(dis, ht, wd, ar, std) ← findpeaksSb(dBFS_i)]
Figure 8 depicts a single outdoor measurement case for three targets: a car, a pedestrian, and a UAV (drone). According to Figure 8, the areas under the peak extracted from the range FFT for the car, the pedestrian, and the drone are 2.5984, 2.038, and 0.45673, respectively. The area increases with the cross-section of the target. Similarly, the peak height, standard deviation, and width also depend on target features such as shape. All of these extracted features are further processed by using machine learning techniques for target classification.

Models
A machine learning model is depicted in Figure 9. Each of the three classes has 226 samples in our data set. Each sample has five properties: the target's radial range (m), the area under the peak (dBFS × m), the peak's height (dBFS), the peak's width (m), and the standard deviation (m) of the peak in the IF signal's range FFT. The class labels are Human, Drone, and Car. Table 2 displays the sample count for each class. The dataset is divided into two sections: training and testing. The training set consists of 90% of the samples of the total dataset, and the testing set contains the remaining 10%. Then, by using our dataset, we compare the performance, size, and other parameters of various machine learning models.
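A 90/10 split of this kind can be sketched as below. The text does not say whether the split was stratified or which random seed was used, so the per-class stratification, the seed, and the function name `stratified_split` are assumptions made for the example.

```python
import numpy as np


def stratified_split(labels, test_fraction=0.1, seed=0):
    """Return (train_idx, test_idx) with roughly test_fraction of each class held out.

    Stratifying by class is an ASSUMPTION; the source only states a 90/10 split.
    """
    labels = np.asarray(labels)
    rng = np.random.default_rng(seed)
    train_idx, test_idx = [], []
    for cls in np.unique(labels):
        idx = rng.permutation(np.flatnonzero(labels == cls))
        n_test = max(1, int(round(test_fraction * len(idx))))
        test_idx.extend(idx[:n_test])
        train_idx.extend(idx[n_test:])
    return np.sort(np.array(train_idx)), np.sort(np.array(test_idx))


if __name__ == "__main__":
    # Three classes with 226 samples each, as described in the text.
    labels = np.repeat(["Human", "Drone", "Car"], 226)
    train_idx, test_idx = stratified_split(labels)
    print(len(train_idx), "training samples,", len(test_idx), "test samples")
```

Stratifying keeps the class balance of the test set equal to that of the full dataset, which matters when per-class recall and precision are reported on a small test set.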

Logistic Regression
Logistic regression models the probabilities for classification problems with two possible outcomes. It is an extension of the linear regression model to classification problems. Logistic regression is a supervised ML model based on the logistic function, and it is useful for predicting binary decision variables {0,1}. There is only one node and two operations: (i) a linear combination of the input with the model parameters, namely the weights and bias (7); and (ii) a non-linear activation, which in this case is a sigmoid function (8). Following the second operation, the model computes the probability 'p' that the sample belongs to the specified class [46], as shown in Equation (9). In (7), 'w' is the weight vector, and 'x' is one sample vector. In (9), 'Y' is the class label, and 'X' is the given dataset. In its most basic form, logistic regression classifies only two classes. Since our dataset has three classes, a multi-class classification scheme is used. For classification, we employ the one-versus-all method, also known as the one-vs.-rest method. In this approach, we generate 'n' classifiers associated with 'n' classes. For each classifier, we choose one class as class '0' and all other classes as class '1' from the dataset. The logistic regression model is then used to distinguish between classes '0' and '1'. The same procedure is repeated for each of the 'n' classifiers [47]. The logistic regression model and its confusion matrix for our dataset ('n' = 3) are shown in Figures 10 and 11 [48], respectively.
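The two operations and the one-vs.-rest scheme can be illustrated with a small NumPy sketch. It is not the authors' implementation: plain batch gradient descent, the learning rate, and the epoch count are assumptions, and the function names are hypothetical. Note that the code labels the chosen class as 1 and the rest as 0, which is the opposite of the labeling convention in the text but equivalent in effect.

```python
import numpy as np


def sigmoid(z):
    """Non-linear activation applied to the linear combination w.x + b."""
    return 1.0 / (1.0 + np.exp(-z))


def fit_one_vs_rest(X, y, n_classes, lr=0.1, epochs=500):
    """Train one binary logistic classifier per class (learning settings ASSUMED)."""
    n, d = X.shape
    W = np.zeros((n_classes, d))
    b = np.zeros(n_classes)
    for k in range(n_classes):
        target = (y == k).astype(float)       # 1 for class k, 0 for the rest
        for _ in range(epochs):
            p = sigmoid(X @ W[k] + b[k])      # predicted probability of class k
            grad = p - target                 # gradient of the log-loss w.r.t. z
            W[k] -= lr * (X.T @ grad) / n
            b[k] -= lr * grad.mean()
    return W, b


def predict(X, W, b):
    """Assign each sample to the classifier with the highest probability."""
    return np.argmax(sigmoid(X @ W.T + b), axis=1)


if __name__ == "__main__":
    # Synthetic, well-separated 2-D clusters; NOT the radar dataset.
    rng = np.random.default_rng(0)
    centers = np.array([[0.0, -5.0], [5.0, 5.0], [-5.0, 5.0]])
    X = np.vstack([c + 0.5 * rng.standard_normal((50, 2)) for c in centers])
    y = np.repeat(np.arange(3), 50)
    W, b = fit_one_vs_rest(X, y, n_classes=3)
    print("training accuracy:", (predict(X, W, b) == y).mean())
```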

Naive Bayes
Naive Bayes (NB) is a type of generative machine learning model. Discriminative models are designed to learn the probability distribution P(y | x) directly, given the input x and corresponding label y. A generative ML model, on the other hand, estimates the joint probability P(x, y) and applies Bayes' theorem to obtain P(y | x). The NB algorithm is a popular supervised ML algorithm for dealing with classification problems. This algorithm is based on the assumption that features are conditionally independent of one another [49]. The NB algorithm has three variants based on the input features: (i) Bernoulli NB for binary features; (ii) Multinomial NB for discrete features; and (iii) Gaussian NB for continuous features. We used the Gaussian NB model because our features are continuous. First, the likelihoods are computed from our dataset. Following that, the posterior probability for each class is computed as shown in (10). In (10), 'Y' is the class label, 'X' is the given dataset, 'p(Y|X)' is the conditional probability of 'Y' given 'X', and 'p(X)' is the marginal probability of 'X'. The sample is assigned to the class with the highest posterior probability. Figure 12 shows the Naive Bayes model. Figure 13 depicts the Naive Bayes model's confusion matrix.
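The Gaussian NB procedure described above can be sketched in a few lines of NumPy. This is an illustrative implementation with hypothetical function names and a small variance floor added for numerical safety (an assumption); it applies Bayes' rule, p(Y|X) = p(X|Y) p(Y) / p(X), in log form and drops p(X) because it is the same for every class.

```python
import numpy as np


def fit_gaussian_nb(X, y):
    """Estimate per-class feature means, variances, and class priors."""
    classes = np.unique(y)
    means = np.array([X[y == c].mean(axis=0) for c in classes])
    # Small floor avoids division by zero for constant features (ASSUMED).
    variances = np.array([X[y == c].var(axis=0) for c in classes]) + 1e-9
    priors = np.array([(y == c).mean() for c in classes])
    return classes, means, variances, priors


def predict_gaussian_nb(X, model):
    """Pick the class with the highest log posterior for each sample."""
    classes, means, variances, priors = model
    diff = X[:, None, :] - means[None, :, :]           # shape (n, classes, features)
    # Sum of per-feature Gaussian log-likelihoods (conditional independence).
    log_lik = -0.5 * (np.log(2.0 * np.pi * variances)[None, :, :]
                      + diff ** 2 / variances[None, :, :]).sum(axis=2)
    log_post = log_lik + np.log(priors)[None, :]       # p(X) omitted: same for all classes
    return classes[np.argmax(log_post, axis=1)]


if __name__ == "__main__":
    # Synthetic clusters for illustration only; NOT the radar dataset.
    rng = np.random.default_rng(1)
    centers = np.array([[0.0, -5.0], [5.0, 5.0], [-5.0, 5.0]])
    X = np.vstack([c + 0.5 * rng.standard_normal((50, 2)) for c in centers])
    y = np.repeat(np.arange(3), 50)
    model = fit_gaussian_nb(X, y)
    print("training accuracy:", (predict_gaussian_nb(X, model) == y).mean())
```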

Support Vector Machine (SVM)
The Support Vector Machine model was the next model we investigated for our dataset. This model generates a classifier by locating a hyperplane between the classes. The optimal hyperplane is the one that separates two classes with the greatest possible margin. The classes may be linearly or non-linearly separable. Hyperplanes are easily found in linearly separable cases. For non-linearly separable classes, SVM uses a kernel to map the low-dimensional input space to a higher-dimensional space in which the classes become linearly separable [50]. The SVM model uses the Lagrangian method and the dual problem formulation for the model's optimization. The Lagrangian function and the dual problem formulation are shown in (11) and (12). In (11) and (12), 'w' is the weight vector, 'α' is the Lagrangian multiplier, 'y_i' is the class label, 'x_i' is the given sample, 'm' is the total number of samples, and 'b' is the bias term. In (12), 'K(x_i, x_j)' is the kernel term, and we used the 'RBF' kernel in this work as defined in Equation (13). In (13), 'γ' is the kernel coefficient. After calculating the optimal 'w' and 'b', the model classifies a sample using Equation (14). In its most basic form, SVM is a binary classification model. Thus, in order to make it work with our dataset, we used the one-vs.-all multi-class classification method described in the previous section. The model's hyperparameter values are as follows: 'C' = 1.0 and 'kernel' = 'rbf'. The SVM model and its confusion matrix are depicted in Figures 14 and 15, respectively.
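Equations (11) to (14) are not reproduced in the text. For orientation, the textbook forms that match the symbols described above are sketched below. They are an assumption about what the missing equations contain, written with the box constraint 0 ≤ α_i ≤ C that corresponds to the soft-margin formulation with regularization parameter 'C'.

```latex
\begin{align}
\mathcal{L}(w, b, \alpha) &= \tfrac{1}{2}\lVert w \rVert^{2}
  - \sum_{i=1}^{m} \alpha_i \left[ y_i \left( w^{\top} x_i + b \right) - 1 \right] \tag{11}\\
\max_{\alpha}\;& \sum_{i=1}^{m} \alpha_i
  - \tfrac{1}{2} \sum_{i=1}^{m} \sum_{j=1}^{m}
    \alpha_i \alpha_j\, y_i y_j\, K(x_i, x_j)
  \quad \text{s.t.}\; 0 \le \alpha_i \le C,\;\; \sum_{i=1}^{m} \alpha_i y_i = 0 \tag{12}\\
K(x_i, x_j) &= \exp\!\left( -\gamma\, \lVert x_i - x_j \rVert^{2} \right) \tag{13}\\
\hat{y}(x) &= \operatorname{sign}\!\left( \sum_{i=1}^{m} \alpha_i\, y_i\, K(x_i, x) + b \right) \tag{14}
\end{align}
```

With a kernel, the weight vector 'w' lives in the implicit feature space, so the decision rule is evaluated through the kernel sum over support vectors rather than through 'w' directly.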

Light Gradient Boost Methods
The Light Gradient Boosting Machine (Light GBM) is the next machine learning model used. Light GBM is currently one of the most powerful boosting algorithms available. It is based on decision trees. Unlike other boosting algorithms that grow the decision tree level-wise or depth-wise, Light GBM grows the decision tree leaf-wise. This leaf-wise split can reduce loss but can also result in overfitting. The model therefore provides a hyperparameter that limits the depth of the tree for the leaf-wise split in order to avoid overfitting [51]. The split is made by calculating the residual value for each leaf as shown in Equation (15). Since we restrict the number of leaves that will be present, we cannot directly sum the residuals of all leaves; instead, we use the gradient boosting transformation technique shown in Equation (16). In Equation (16), 'γ' is the transformation value, 'r' is the residual of each leaf, and 'p' is the previous predicted probability for each residual. Thus, we transform the tree by this method. When compared to other machine learning algorithms, this model is extremely fast. This model is built with the available 'lightgbm' library. The hyperparameter values for the model are as follows: 'boosting_type' = 'gbdt'; 'objective' = 'multiclass'; 'metric' = 'multi_logloss'; 'sub_feature' = 0.5; 'num_leaves' = 10; 'min_data' = 50; 'max_depth' = 10; and 'num_class' = 3. Figure 16 depicts the Light GBM model. Figure 17 depicts the confusion matrix for the Light GBM model on test data.
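For reference, the listed hyperparameters can be collected into a parameter dictionary of the kind the `lightgbm` library accepts. The values are those stated in the text; the underscore spelling follows LightGBM's parameter names, where `sub_feature` is an alias of `feature_fraction` and `min_data` is an alias of `min_data_in_leaf`. The training call in the trailing comment is a sketch with hypothetical variable names `X_train` and `y_train`, and it is left commented out so that the block runs without the library installed.

```python
# Hyperparameters as listed in the text, in LightGBM's underscore spelling.
LGBM_PARAMS = {
    "boosting_type": "gbdt",      # gradient boosted decision trees
    "objective": "multiclass",    # three-way classification
    "metric": "multi_logloss",    # multi-class log-loss
    "sub_feature": 0.5,           # alias of feature_fraction
    "num_leaves": 10,             # cap on leaves per tree (leaf-wise growth)
    "min_data": 50,               # alias of min_data_in_leaf
    "max_depth": 10,              # depth limit that curbs overfitting
    "num_class": 3,               # Human, Drone, Car
}


def leaves_fit_depth(params):
    """Check that num_leaves does not exceed what max_depth allows (2 ** depth)."""
    return params["num_leaves"] <= 2 ** params["max_depth"]


if __name__ == "__main__":
    print("num_leaves consistent with max_depth:", leaves_fit_depth(LGBM_PARAMS))
    # Sketch of the training call (requires the lightgbm package):
    # import lightgbm as lgb
    # booster = lgb.train(LGBM_PARAMS, lgb.Dataset(X_train, label=y_train))
```

With 'num_leaves' = 10, the leaf cap rather than the depth limit of 10 is the binding constraint on tree size for this configuration.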

Performance Evaluation
On the test dataset, the performance of the four deployed models is compared using four evaluation metrics. Each evaluation metric is computed from the following elements: 'True Positive (TP)', 'True Negative (TN)', 'False Positive (FP)', and 'False Negative (FN)' [52]. The metrics are detailed below.

Accuracy
The accuracy [52] of all four models, along with their inference time and model size, is shown in Table 3. The accuracy is calculated according to Equation (17). It can be observed from Table 3 that the LightGBM method provides the best accuracy of 95.6%. For all models, the inference time is under 0.5 ms.
Accuracy = (TP + TN)/(TP + FP + TN + FN) (17)

Recall
The recall [52] of all four models is shown in Table 4. The recall is calculated according to Equation (18). From Table 4, it can be observed that the Light GBM model performs the best for all the classes.
Recall = TP/(TP + FN) (18)

Precision
The precision [52] metric for all the models is shown in Table 5. This metric is calculated according to Equation (19). It can be observed from the table that the Light GBM model outperforms all the other models for all three classes.
Precision = TP/(TP + FP) (19)

F1-Score
The F1-score [52] metric is calculated and shown for all the models in Table 6. This metric is calculated according to Equation (20). It can be observed from Table 6 that the Light GBM model values are the best as compared to the other models for all the classes.
F1-score = 2 × (Recall × Precision)/(Recall + Precision) (20)
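The four metrics can be computed per class from a multi-class confusion matrix by treating each class in turn as the positive class. The sketch below assumes rows are true labels and columns are predicted labels; the function name and the example matrix are made up for illustration and are not the paper's results.

```python
import numpy as np


def per_class_metrics(conf):
    """Per-class accuracy, recall, precision, and F1 from a confusion matrix.

    conf[i, j] counts samples of true class i predicted as class j (ASSUMED layout).
    """
    conf = np.asarray(conf, dtype=float)
    total = conf.sum()
    results = {}
    for k in range(conf.shape[0]):
        tp = conf[k, k]
        fn = conf[k, :].sum() - tp            # class k predicted as something else
        fp = conf[:, k].sum() - tp            # other classes predicted as k
        tn = total - tp - fn - fp
        recall = tp / (tp + fn) if tp + fn else 0.0
        precision = tp / (tp + fp) if tp + fp else 0.0
        f1 = (2 * recall * precision / (recall + precision)
              if recall + precision else 0.0)
        results[k] = {
            "accuracy": (tp + tn) / total,
            "recall": recall,
            "precision": precision,
            "f1": f1,
        }
    return results


def overall_accuracy(conf):
    """Fraction of all samples on the diagonal of the confusion matrix."""
    conf = np.asarray(conf, dtype=float)
    return float(np.trace(conf) / conf.sum())


if __name__ == "__main__":
    example = [[8, 1, 1],      # hypothetical counts, for illustration only
               [0, 9, 1],
               [2, 0, 8]]
    for cls, m in per_class_metrics(example).items():
        print(cls, {name: round(value, 3) for name, value in m.items()})
    print("overall accuracy:", round(overall_accuracy(example), 3))
```

The per-class accuracy from Equation (17) counts true negatives, so it is generally higher than the overall accuracy, which only counts correctly labeled samples.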

Conclusions
In order to identify targets by using mmWave FMCW radars, a novel classification technique based on statistical features from a range profile has been proposed. The proposed method should be extended to include long-range targets as well as targets of various types with different shapes and sizes. The range profile may lack distinguishable features for long-range targets and targets with small cross sections, necessitating additional signal processing before applying machine learning. In addition to the features presented here, micro-Doppler features and various time-frequency plots can be incorporated into models to effectively classify the targets if any of the targets have vibrating parts or repeating patterns. In order to improve the robustness of the classification technique, the range-Doppler and range-azimuth plot features can be incorporated into the model.