Localization of Multi-Class On-Road and Aerial Targets Using mmWave FMCW Radar

mmWave radars play a vital role in autonomous systems, such as unmanned aerial vehicles (UAVs), unmanned surface vehicles (USVs), and ground station control and monitoring systems. The main challenge when using mmWave radars is accurately estimating the angle of arrival (AoA) of the targets, due to the limited number of receivers. In this paper, we present a novel AoA estimation technique that uses mechanical rotation of mmWave FMCW radars operating in the frequency range 77–81 GHz. Rotating the radar also increases the field of view in both azimuth and elevation. The proposed method estimates the AoA of the targets using only a single transmitter and receiver. The measurements are carried out in a variety of practical scenarios including pedestrians, a car, and a UAV, also known as a drone. With the measured data, range-angle maps are created, and morphological operators are used to estimate the AoA of the targets. We also process radar range-angle images for improved visual representation. The proposed method will be extremely beneficial for practical ground stations, traffic control and monitoring frameworks for both on-ground and airborne vehicles.


Introduction
Radars are used in several applications in both the automotive and industrial sectors. Radars for automotive applications are summarized in [1]. Applications of biomedical MIMO radars are summarized in [2]. Recently, radars were explored for air-conditioning systems [3]. Radars for medical applications are summarized in [4]. Human localization and vital sign measurements using a hand-held through-wall imaging radar are explored in [5]. The detection and ranging of human targets in cluttered environments is proposed in [6]. Indoor human localization and life activity monitoring is proposed in [7]. mmWave radars are explored in several UAV applications [8][9][10][11][12][13][14][15]. However, using radars for UAV ground station monitoring and control is still challenging, as the radar needs to localize and track small dynamic UAVs in a wide field of view (theoretically a 360 degree field of view). mmWave radars offer very high resolution, due to their large bandwidth. These radars are extremely compact and have very low power consumption. They can estimate the angle of arrival (AoA), range and velocity of targets in their direct field of view (FoV). Targets at distances in the range of 300-400 m [16] can be easily detected by them at an operating frequency of a couple of GHz.
An object's range and velocity, in particular, can be accurately determined. Only a single transmitter and receiver are required to estimate target range and velocity. However, the performance of the range and velocity estimation depends on the chirp configuration parameters. The number of receiving antennas, on the other hand, has a large impact on the accuracy and resolution of the targets' AoA: the greater the number of receiving antennas, the better the performance. The combination of compressive sensing and multiple-input multiple-output (MIMO) was shown to improve the angular resolution [12]. The concept of MIMO is utilized to form a large number of virtual Tx-Rx pairs with a limited number of Tx and Rx antennas [17]. To achieve 1 degree of angular resolution, at least 115 virtual Tx-Rx pairs are required, leading to quite a large number of physical transmitters and receivers. This also increases the hardware complexity and the associated signal processing chains. As a result, mmWave radars use a small number of transmitting and receiving antennas to reduce cost and complexity.
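The figure of 115 virtual pairs can be checked against the usual angular resolution estimate for a uniform linear array. The worked step below is a sketch that assumes half-wavelength element spacing ($d = \lambda/2$) and a target at boresight ($\theta = 0$); neither assumption is stated in the text.

```latex
\theta_{\mathrm{res}} = \frac{\lambda}{N\, d \cos\theta}
\quad\xrightarrow{\; d = \lambda/2,\ \theta = 0 \;}\quad
\theta_{\mathrm{res}} = \frac{2}{N}\ \text{rad},
\qquad
N = \frac{2}{\theta_{\mathrm{res}}} = \frac{2}{\pi/180} \approx 114.6
\;\Rightarrow\; N \ge 115 .
```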
Due to transceiver antenna limitations, the FoV can only be enhanced in one of two directions: elevation or azimuth. The antenna arrangement is usually chosen to increase the field of view in the azimuth direction, which is important for many applications. However, mmWave modules used in traffic management systems and installed as ground stations require a wide field of view in both the azimuth and elevation directions. A two-dimensional antenna array can help in widening the FoV in both the elevation and azimuth directions. However, as the number of transceiver antennas grows, so do the complexity, computational latency, and cost.
An angle delay estimation method based on extended one-dimensional pseudospectrum searching is proposed in [18]. Only two targets are used in that study, and the method is computationally intensive. Its complexity necessitates additional research for more than two targets. Moreover, because the measurements in [18] were not taken in an open environment, they do not represent a complex scenario. In [19], a two-dimensional parameter estimator is proposed that combines the extrapolated fast Fourier transform (FFT) with multiple signal classification (MUSIC). This study, however, does not take into account a complex scene. In [20], range and angle estimation is proposed, employing estimation of signal parameters through rotational invariance techniques (ESPRIT). The algorithm proposed there works even when the number of targets exceeds the number of receivers. All of these methods need at least two receivers to estimate the AoA of targets, and the angle resolution improves as the number of receiving antennas increases.
2D synthetic-aperture radar (SAR) imaging is employed with FMCW radars in [21]. The image is reconstructed using a two-dimensional FFT or range-Doppler plot. Because of the fixed horizontal and vertical movement, this approach can capture the target only in a constant FoV. This limits the user's ability to capture a variety of scenarios, and localization of multiple targets remains challenging.
In [22], a fan-beam antenna is used to implement a three-dimensional (3D) view with a mmWave radar, which finds application in mobile robotics. However, information on AoA is lacking, and there are limitations regarding the measurements in range and velocity. In [23], 3D near-field imaging was proposed for robotics and security scanning. That work combines the radar with LiDAR data, which adds to the computational complexity and delay. In [24], a synthetic aperture mmWave ground station search and track radar is proposed. It has numerous drawbacks, such as being large, complicated, and lacking target AoA calculation for rotating radars. The rotating FMCW radar is used for localization and mapping, where the task is to determine the target's range and velocity. The majority of these works have not concentrated on multiple target localization, which is crucial for a wide range of applications. A mechanical scanning FMCW radar is proposed in [25]. It uses a bandwidth of only 400 MHz, and detailed experiments and automatic angle estimation for multi-target scenarios are still missing. A 3D millimeter wave system is proposed in [26] for robotic mapping and localization, as well as security scan applications. It mainly focuses on indoor short-range applications. Practical multi-target outdoor scenarios still need further investigation.
To address the aforementioned issues, we present a rotating mmWave FMCW radar capable of detecting the target range and AoA. We utilize range-angle maps computed from a 1D range FFT profile for localization. It is possible to obtain target features, such as distance and velocity [27,28], accurately with fixed positioning of radars, but it is difficult to accurately estimate the AoA of the targets with a limited number of receiving antennas. Using our AoA estimation approach, we can locate objects and estimate their AoA in a wide field of view. This FoV is also configurable, allowing it to be adjusted to the demands of the application. The major contributions of this paper are as follows:
• We propose AoA estimation of multi-class targets by mmWave FMCW radar measurements in a practical outdoor setting.
• The proposed method requires only 1 Tx antenna and 1 Rx antenna for the localization of multi-class targets.
• The proposed localization method using mmWave FMCW radar achieves a large FoV in both azimuth and elevation directions.
• The proposed method estimates the AoA of both on-road and aerial targets, using morphological operators on range-angle maps.
• The proposed method improves the visual representation of multi-class targets, using range-angle images.
The rest of the paper is organized as follows. Section 2 discusses the details of the radar system. Section 3 elaborates on the measurements and signal processing. Section 4 presents the angle of arrival estimation of multiple targets, using morphological operators on range-angle maps. Results are discussed in Section 5. Finally, Section 6 presents the conclusion and potential future work.

RF Signal Generator
In Equation (1), $f_s$ is the chirp's starting frequency, $\phi_i$ is the initial phase of the chirp, and $\beta$ is the slope of the chirp, given by the following:

$$\beta = \frac{f_f - f_s}{T_c} \tag{2}$$

where $f_f$ is the chirp's final frequency, and $T_c$ is the chirp time over which the chirp's frequency changes from $f_s$ to $f_f$. The transmitted chirp's frequency is given by Equation (3):

$$f(t) = f_s + \beta t, \quad 0 \le t \le T_c \tag{3}$$

The received signal, $R(t)$, reflected off the distant targets, is a delayed version of the transmitted signal, $T(t)$. $R(t)$ is denoted by the following:

$$R(t) = T(t - \tau) \tag{4}$$

The round trip time delay, $\tau$, is defined by the following:

$$\tau = \frac{2R}{c} \tag{5}$$

where $R$ denotes the distance of the detected target from the radar, and $c$ denotes the speed of light. The reflected signals from distant targets are mixed with the in-phase and quadrature-phase components of the transmitted signal, and the complex IF signal is generated as shown in Figure 1. This IF signal is processed further and digitized, using ADCs at a sampling frequency of 10 MSPS [32]. The frequency of this IF signal is proportional to the range of the target in direct line of sight (D-LOS) that reflects the transmitted chirp, as given by Equation (6):

$$f_{IF} = \frac{2 \beta R}{c} \tag{6}$$

where $R$, $f_{IF}$, and $c$ are the range, the intermediate frequency, and the speed of light in vacuum, respectively. Range profiles are computed from the measured raw data and further processed to obtain the range-angle maps [30].
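The relation between chirp slope, round trip delay, IF frequency and range can be sketched in a few lines of Python. This is only an illustration: the function names are hypothetical, and the 100 µs chirp time used in the example is an assumed value, since the text does not state the chirp configuration.

```python
C = 299_792_458.0  # speed of light in vacuum, m/s


def chirp_slope(f_start_hz, f_final_hz, chirp_time_s):
    """Chirp slope beta = (f_f - f_s) / T_c, in Hz per second."""
    return (f_final_hz - f_start_hz) / chirp_time_s


def round_trip_delay(range_m):
    """Round trip delay tau = 2R / c, in seconds."""
    return 2.0 * range_m / C


def if_from_range(range_m, slope_hz_per_s):
    """IF frequency produced by a target at the given range: f_IF = beta * tau."""
    return slope_hz_per_s * round_trip_delay(range_m)


def range_from_if(f_if_hz, slope_hz_per_s):
    """Invert the relation above: R = c * f_IF / (2 * beta)."""
    return C * f_if_hz / (2.0 * slope_hz_per_s)


# Assumed example: a 77-81 GHz sweep over a hypothetical 100 us chirp.
beta = chirp_slope(77e9, 81e9, 100e-6)
f_if = if_from_range(15.0, beta)  # IF frequency for a target 15 m away
print(f"slope = {beta:.3e} Hz/s, f_IF = {f_if / 1e6:.3f} MHz")
```

With these assumed parameters a target at 15 m produces an IF tone of roughly 4 MHz, below the 10 MSPS sampling rate mentioned above.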

Outdoor Measurements and Pre-Processing
The measurements are taken in a realistic outdoor setting with multiple targets in the scene. The raw radar data are used to create range-angle maps of all measurement scenes. These range-angle maps are further processed using morphological operators. Several measurement scenes were captured; the summary of all the measurement instances can be found in Table 2. For instance, in case-a, human-1 is positioned at 30° and 9 m distance, human-2 at 60° and 11 m, human-3 at 90° and 13 m, human-4 at 120° and 15 m, and human-5 at 150° and 17 m from the radar. Additionally, a drone is positioned at 0° and 5 m distance, and a car is positioned at 0° and 19 m distance. During all measurements, the raw IF data are captured from the mmWave radar. The raw radar data are then processed using MATLAB, and the details can be accessed at [33]. Range-angle heatmaps are created for all measurement scenes [30]. The generation of the angle axis for the range-angle maps is briefly described here.
A programmable rotor is attached to the radar to cover a certain FoV, $\theta_{FoV}$, in $T$ seconds. The radar transmits $N_f$ frames while covering the same FoV, $\theta_{FoV}$, in $T$ seconds. Thus, the entire $\theta_{FoV}$ is divided into angle bins, denoted by $\theta_{bin}$.
The angle bins ($\theta_{bin}$), total FoV ($\theta_{FoV}$), and total number of frames ($N_f$) are related by Equation (8):

$$\theta_{bin} = \frac{\theta_{FoV}}{N_f} \tag{8}$$

In the outdoor measurements, $\theta_{FoV}$ is set to 180° and $N_f$ is set to 800, so each frame corresponds to 0.225°, i.e., 4.44 frames per degree. A range-angle heat map is then plotted using the range profiles. Such range-angle heat maps for the measurements of case-e and case-f can be seen in Figures 3 and 4, respectively.
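The mapping from frame index to angle can be sketched as follows. The sketch assumes that the rotor turns at constant speed, that frames are evenly spaced in time, and that each frame is assigned the angle at the start of its bin; the function names are illustrative and not taken from the processing code in [33].

```python
def angle_bin_width(fov_deg, n_frames):
    """Angular width covered by one frame: theta_bin = theta_FoV / N_f."""
    return fov_deg / n_frames


def angle_axis(fov_deg, n_frames, start_deg=0.0):
    """Angle assigned to each frame, assuming a constant rotation speed."""
    width = angle_bin_width(fov_deg, n_frames)
    return [start_deg + k * width for k in range(n_frames)]


# Values used in the outdoor measurements: 180 degree FoV, 800 frames.
axis = angle_axis(180.0, 800)
print(angle_bin_width(180.0, 800))    # degrees per frame
print(800 / 180.0)                    # frames per degree
```

Each row of the range-angle map is then the range profile of one frame, placed at the corresponding entry of `axis`.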

Range and Angle Estimation Using Morphological Operators on Range-Angle Maps
Image processing techniques were used to process the range-angle map images after obtaining them. The flowchart in Figure 5 depicts the various processing steps.

• Because four receivers were used here, the data set for each measurement case was divided into four sets, one per receiver. The images were then processed one by one. However, only 1 Tx and 1 Rx are required for angle estimation using the proposed method.
• The scale was cropped from the image using Otsu thresholding, and objects were displayed based on the most definite contour, i.e., the one with the largest area.
• The image was then divided into three channels, namely BGR, which were stored in a list and converted into grayscale images. Individual channels were then processed.
• To smooth the image, Gaussian blurring was used, followed by Otsu thresholding, to remove noise and binarize it.
• After obtaining the binary image, inversion based on the number of white and black pixels was performed, followed by the morphological closing operation with a 10 × 10 elliptical structuring element to obtain proper contours. Any areas smaller than 150 px² were removed.
• The best two of the three channels were then chosen, and their intersection was used to generate the final processed image. Later, only contours with a common area in at least three of the four images were kept, and the best contours based on the number of objects were chosen.
• Finally, using the concept of moments, the centroids were plotted.
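The channel intersection and the "at least three of four receivers" rule in the steps above can be illustrated with a pixel-wise vote over binary masks. This is a simplified sketch: the steps above operate on contours, whereas this version compares individual pixels, and the helper name is hypothetical.

```python
def vote_across_masks(masks, min_votes):
    """Keep a pixel only if it is foreground in at least `min_votes` masks.

    `masks` is a list of equally sized binary images (lists of rows of 0/1).
    """
    rows, cols = len(masks[0]), len(masks[0][0])
    return [
        [1 if sum(m[r][c] for m in masks) >= min_votes else 0 for c in range(cols)]
        for r in range(rows)
    ]


# Intersection of two channel masks: both must agree.
channel_a = [[1, 1, 0], [0, 1, 1]]
channel_b = [[1, 0, 0], [0, 1, 1]]
intersection = vote_across_masks([channel_a, channel_b], min_votes=2)

# Four receiver masks: keep pixels present in at least three of them.
rx_masks = [
    [[1, 1, 0], [0, 0, 1]],
    [[1, 0, 0], [0, 1, 1]],
    [[1, 1, 0], [0, 0, 0]],
    [[0, 1, 0], [0, 0, 1]],
]
kept = vote_across_masks(rx_masks, min_votes=3)
print(intersection, kept)
```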

Otsu Thresholding
Thresholding is a basic tool of image segmentation. It is used to turn a grayscale image into a binary image: each pixel is replaced with a black pixel if its intensity is less than a fixed threshold value $T$, and with a white pixel otherwise. In Otsu thresholding, this value is determined so that the weighted within-class variance is minimized [34]. The relation is as follows:

$$\sigma_w^2(t) = \omega_0(t)\,\sigma_0^2(t) + \omega_1(t)\,\sigma_1^2(t)$$

where $\omega_0(t)$ and $\omega_1(t)$ are the fractions of pixels in the background and foreground classes separated by the threshold $t$, and $\sigma_0^2(t)$ and $\sigma_1^2(t)$ are the intensity variances of these two classes. Unlike global thresholding, which selects an arbitrary value as the threshold, Otsu thresholding loops through all possible threshold values, calculates the spread on both the foreground and background sides using the above formula, and fixes the threshold at the value for which this spread is minimum [34].
In this case, Otsu thresholding is used to remove the scaling from the original image so that processing can take place. Later, Otsu is used to convert the grayscale image to a binary image because the morphological operation can be performed on a binary image with only two pixel intensities to deal with. The resulting image can be seen in Figure 6.
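The exhaustive search over thresholds described above can be written directly from the within-class variance. The following is a minimal pure-Python sketch for 8-bit intensities, not the implementation used for the figures; image processing libraries provide optimized equivalents.

```python
def otsu_threshold(pixels):
    """Return the threshold t (0-255) that minimizes the weighted
    within-class variance; pixels <= t form the background class."""
    hist = [0] * 256
    for p in pixels:
        hist[p] += 1
    total = len(pixels)

    best_t, best_var = 0, float("inf")
    for t in range(256):
        n0 = sum(hist[: t + 1])          # background pixel count
        n1 = total - n0                  # foreground pixel count
        if n0 == 0 or n1 == 0:
            continue                     # one class is empty, skip
        mu0 = sum(i * hist[i] for i in range(t + 1)) / n0
        mu1 = sum(i * hist[i] for i in range(t + 1, 256)) / n1
        var0 = sum(hist[i] * (i - mu0) ** 2 for i in range(t + 1)) / n0
        var1 = sum(hist[i] * (i - mu1) ** 2 for i in range(t + 1, 256)) / n1
        within = (n0 * var0 + n1 * var1) / total
        if within < best_var:            # strict '<' keeps the first minimum
            best_t, best_var = t, within
    return best_t


def binarize(pixels, t):
    """Map intensities above the threshold to 1 and the rest to 0."""
    return [1 if p > t else 0 for p in pixels]


# Toy bimodal "image": 50 dark pixels and 50 bright pixels.
toy = [10] * 50 + [200] * 50
t = otsu_threshold(toy)
binary = binarize(toy, t)
```

For this toy input any threshold between the two intensity clusters gives zero within-class variance; the sketch returns the first such value.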

Gaussian Blurring
In Gaussian blurring, the pixels closest to the center are given the most weight. A group of weights, referred to as a kernel, is slid across the image and centered on each pixel that needs to be filtered. The weighted average of pixel intensities under the kernel is calculated to apply this filter [35]. The Gaussian blur filter simply convolves the image with a Gaussian function, as shown below:

$$G(x, y) = \frac{1}{2\pi\sigma^2} \exp\!\left(-\frac{x^2 + y^2}{2\sigma^2}\right)$$

Here, $x$ is the horizontal distance from the origin, $y$ is the vertical distance from the origin, and $\sigma$ is the standard deviation of the Gaussian distribution. The values of this Gaussian function form a convolution matrix, which is then used to convolve the original image [36]. Each pixel is then replaced by the weighted mean of its neighborhood. After obtaining the three channel images as shown in Figure 7, Gaussian blurring is used to remove noise. Then, the filtered image is thresholded and converted into a binary image.
Figure 7. The left image is the one obtained after converting into each channel and the right one is the Gaussian blurred image (case-"n").
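As an illustration of how the Gaussian function becomes a convolution matrix, the sketch below samples it on a small grid, normalizes the weights to sum to one, and applies the kernel to an image stored as a list of rows. The kernel size, the value of σ and the replicate-border handling are assumptions, since the text does not specify them.

```python
import math


def gaussian_kernel(size, sigma):
    """Sample G(x, y) on a size x size grid centred on the origin and
    normalize so that the weights sum to 1. `size` must be odd."""
    if size % 2 == 0:
        raise ValueError("kernel size must be odd")
    half = size // 2
    raw = [
        [math.exp(-(x * x + y * y) / (2 * sigma**2)) / (2 * math.pi * sigma**2)
         for x in range(-half, half + 1)]
        for y in range(-half, half + 1)
    ]
    total = sum(map(sum, raw))
    return [[v / total for v in row] for row in raw]


def gaussian_blur(image, kernel):
    """Replace each pixel by the weighted mean of its neighbourhood.
    Pixels outside the image are taken from the nearest edge pixel."""
    rows, cols = len(image), len(image[0])
    half = len(kernel) // 2
    out = [[0.0] * cols for _ in range(rows)]
    for r in range(rows):
        for c in range(cols):
            acc = 0.0
            for dy in range(-half, half + 1):
                for dx in range(-half, half + 1):
                    rr = min(max(r + dy, 0), rows - 1)
                    cc = min(max(c + dx, 0), cols - 1)
                    acc += kernel[dy + half][dx + half] * image[rr][cc]
            out[r][c] = acc
    return out


kernel = gaussian_kernel(5, sigma=1.0)
impulse = [[0.0] * 7 for _ in range(7)]
impulse[3][3] = 255.0                      # a single bright pixel
blurred = gaussian_blur(impulse, kernel)   # energy spreads to neighbours
```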

Morphological Operation-Closing
Closing is a morphological operator derived from erosion and dilation [37]. It is typically used on binary images. It enlarges the boundaries of the bright (foreground) parts of an image and fills gaps in such areas. It is a dilation followed by an erosion, and it keeps the background regions that have a similar shape to the structuring element, while removing the other background pixels [37].
Dilation fills holes in images, but it also distorts the region boundaries; the subsequent erosion reduces this distortion. In effect, a structuring element (SE) is moved across the background of the image (outside the foreground region). If the SE can be made to touch a background point without any part of it lying inside the foreground region, then that point remains background; otherwise, it becomes foreground [37]. Closing can be mathematically represented as follows:

$$A \bullet B = (A \oplus B) \ominus B$$

where $A$ is the binary image, $B$ is the structuring element, and $\oplus$ and $\ominus$ denote dilation and erosion, respectively. Following the closing operation, further bounds are applied to the image, such as classification and selection based on the contour area (the minimum bound is selected as 150 px² based on observation), followed by the best channel selection and contour intersection as explained in the algorithm. Finally, long horizontal lines with a width bound of 46 are removed, and the resultant processed image is shown in Figure 8.
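Closing as dilation followed by erosion can be demonstrated on a small binary image. The sketch below uses a 3 × 3 square structuring element for brevity, whereas the processing described here uses a 10 × 10 elliptical one; pixels outside the image are ignored, which is an assumption about border handling.

```python
def square_element(radius=1):
    """Offsets (dy, dx) of a square structuring element of side 2*radius+1."""
    span = range(-radius, radius + 1)
    return [(dy, dx) for dy in span for dx in span]


def _inside(img, r, c):
    return 0 <= r < len(img) and 0 <= c < len(img[0])


def dilate(img, element):
    """A pixel becomes foreground if the element hits any foreground pixel."""
    return [
        [1 if any(_inside(img, r + dy, c + dx) and img[r + dy][c + dx]
                  for dy, dx in element) else 0
         for c in range(len(img[0]))]
        for r in range(len(img))
    ]


def erode(img, element):
    """A pixel stays foreground only if every in-bounds pixel under the
    element is foreground."""
    return [
        [1 if all(not _inside(img, r + dy, c + dx) or img[r + dy][c + dx]
                  for dy, dx in element) else 0
         for c in range(len(img[0]))]
        for r in range(len(img))
    ]


def closing(img, element):
    """Closing: dilation followed by erosion with the same (symmetric) element."""
    return erode(dilate(img, element), element)


# A 3 x 3 blob with a one-pixel hole, centred in a 7 x 7 image.
blob = [[0] * 7 for _ in range(7)]
for r in range(2, 5):
    for c in range(2, 5):
        blob[r][c] = 1
blob[3][3] = 0
closed = closing(blob, square_element(1))   # the hole is filled, outline kept
```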

Concept of Moments
In mathematics, the moments of a function are quantitative characteristics of the function's shape [38]. In this case, the zeroth moment represents the total mass, and the first moments divided by the total mass yield the center of mass. Contours are laminas, and their geometric center is the centroid; because we assumed that the density of a lamina is constant [39], the center of mass and the centroid coincide. Following that, in order to obtain proper contours (laminae), we drew a parallel between the intensity of pixels and the mass of an object, and then divided by the total mass to obtain the center of mass, which is the centroid of the lamina or contour. We employed the bounding rectangle concept [39] because we assumed the lamina to be a rectangle such that the contour is completely enclosed and touches the boundaries of the rectangle. Its centroid is calculated using the formulae given below. Moments are calculated as follows:

$$M_{ij} = \sum_{x} \sum_{y} x^{i}\, y^{j}\, I(x, y)$$

where $I(x, y)$ is the intensity of the pixel at $(x, y)$. The COM, or the centroid in this case, is computed as follows:

$$\bar{x} = \frac{M_{10}}{M_{00}}, \qquad \bar{y} = \frac{M_{01}}{M_{00}}$$

Here, $\bar{x}$ and $\bar{y}$ represent the x coordinate and y coordinate of the center of mass of the contour, respectively. The centroid of each object in the image is shown in Figure 9.
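The centroid computation from raw image moments can be sketched as follows for a binary (or grayscale) image stored as a list of rows. The function names are illustrative; the coordinates are in pixels and would still need to be mapped to range and angle using the axes of the range-angle map.

```python
def raw_moment(image, i, j):
    """Raw moment M_ij = sum over x, y of x^i * y^j * I(x, y)."""
    return sum(
        (x**i) * (y**j) * value
        for y, row in enumerate(image)
        for x, value in enumerate(row)
    )


def centroid(image):
    """Centre of mass (x_bar, y_bar) = (M10 / M00, M01 / M00)."""
    m00 = raw_moment(image, 0, 0)
    if m00 == 0:
        raise ValueError("image has no foreground pixels")
    return raw_moment(image, 1, 0) / m00, raw_moment(image, 0, 1) / m00


# A rectangular blob covering x = 1..3 and y = 1..2.
mask = [
    [0, 0, 0, 0, 0],
    [0, 1, 1, 1, 0],
    [0, 1, 1, 1, 0],
    [0, 0, 0, 0, 0],
]
x_bar, y_bar = centroid(mask)
```

For a uniform rectangle the result coincides with the geometric center, which is the constant-density assumption made above.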

Results and Discussion
Using all of the image processing techniques, including Gaussian blurring, Otsu thresholding, inversion, closing, and moments, we calculated the centroid of each contour, which represents the location of an object. The centroid's (x, y) coordinates represent (range, angle). Graphs were plotted to show the spread of values across a specific range/angle. How close the estimated values were to the actual values is shown in the Bland-Altman plot.
Figures 10 and 11 depict the spread of the range and angle values. The spread of values in range is almost negligible, whereas the spread in angle is high at the edges, i.e., at 0 and 180 degrees. This is because the radar rotation was started from 0 degrees, whereas it should have started from a negative angle to obtain accurate values, resulting in higher variance at the extreme values of the angles. The x-axis in Figures 10 and 11 represents the actual ranges/angles, and the y-axis represents the values obtained from the radar images using the image processing techniques.

Statistical Analysis of Measurements
In the literature, two kinds of measurement accuracy evaluation techniques are used to quantify the closeness between the actual values and the measured (or estimated) values: (i) the Pearson correlation (PC) coefficient with linear regression parameters computed for the actual and measured values, and (ii) the Bland-Altman plot with a set of benchmark metrics, such as bias, standard deviation (SD), limit of agreement (LOA), and Bland-Altman ratio (BAR). If the PC value is close to 1, then the measured values nearly equal the actual values. If the BAR value is <10%, then the agreement between the actual and measured series is good. The agreement is moderate if 10% < BAR ≤ 20%, and insufficient if BAR > 20%. The Pearson correlation (PC) coefficient is computed as follows:

$$PC = \frac{\sum_{i}(x_i - \bar{x})(y_i - \bar{y})}{\sqrt{\sum_{i}(x_i - \bar{x})^2 \, \sum_{i}(y_i - \bar{y})^2}}$$

where $\bar{x}$ and $\bar{y}$ denote the mean of the actual and measured values, respectively. The bias, standard deviation (SD), limit of agreement (LOA), acceptance limit (AL), and Bland-Altman ratio (BAR) metrics are computed from the Bland-Altman plot following [40]:
• Bias indicates a mean shift in the measured values relative to the actual values.
• SD denotes the standard deviation of the differences between the actual (x) and measured (y) values.
• LOA indicates the agreement limits.
• BAR relates SD to the acceptance limit (AL).
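The metrics above can be computed as in the following sketch. Several conventions in it are assumptions rather than definitions taken from [40]: differences are taken as measured minus actual, SD is the sample standard deviation, the limits of agreement are bias ± 1.96·SD, AL is taken as the mean of the pairwise means, and BAR is 1.96·SD divided by AL. The sample values are made up for illustration.

```python
import math


def pearson(x, y):
    """Pearson correlation coefficient between two equally long sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    num = sum((a - mx) * (b - my) for a, b in zip(x, y))
    den = math.sqrt(sum((a - mx) ** 2 for a in x) * sum((b - my) ** 2 for b in y))
    return num / den


def bland_altman(actual, measured):
    """Bias, SD, limits of agreement and BAR under the assumptions stated above."""
    diffs = [m - a for a, m in zip(actual, measured)]   # assumed sign convention
    n = len(diffs)
    bias = sum(diffs) / n
    sd = math.sqrt(sum((d - bias) ** 2 for d in diffs) / (n - 1))
    loa = (bias - 1.96 * sd, bias + 1.96 * sd)
    al = sum((a + m) / 2 for a, m in zip(actual, measured)) / n  # assumed AL
    bar_percent = 100.0 * 1.96 * sd / al
    return {"bias": bias, "sd": sd, "loa": loa, "bar_percent": bar_percent}


# Hypothetical ranges in metres (not data from the paper).
actual = [9.0, 11.0, 13.0, 15.0, 17.0]
measured = [8.6, 10.7, 12.5, 14.6, 16.7]
stats = bland_altman(actual, measured)
r = pearson(actual, measured)
```

A BAR below 10% from such a computation would fall in the "good agreement" band described above.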
From the error analysis shown in Table 3 and Figure 12, it is observed that the proposed measurement method has a range bias of −0.4142 m with 95% agreement limits of [−0.7945, −0.0339] m and an angle bias of 3.617 degrees with 95% agreement limits of [−9.087, 16.32] degrees. For the same error analysis, the Pearson correlation coefficient is 99.96% for the range measurements, indicating strong correlation between the measured and actual values. The proposed method has a BAR value of 2.76%, which indicates very good agreement between the measured and actual range values. It is further noticed that the method has a BAR value of 14.23% for the measurement of angles, although the Pearson correlation coefficient is 99.54%; therefore, we further investigated the angle measurements for different error ranges. From the angle measurement error analysis, we noticed that the method has an angle error of <3° for 29 targets, 3-6° for 29 targets, 6-10° for 45 targets, and >10° for 17 targets. For the 18 targets with zero angle, the method has an angle error of >8° for 17 targets. The statistical analysis results in Table 3 and Figure 12 show that the bias, LOA and BAR values of the proposed measurement method indicate close agreement between the actual and estimated values: very good for range and, according to the BAR criteria above, moderate for angle. Considering the results and deviation, it is possible to conclude that the proposed image processing method, which employs the morphological closing operator to obtain definite contours, is accurate on almost all ranges and angles, except the extremes. This image processing technique is lightweight and does not necessitate a large amount of computation.
The proposed model's complexity and performance merits are compared to [20,41-44]. Table 4 shows that the computational complexity of the proposed model is similar to that of [44] and that it has a large FoV in both azimuth and elevation when compared to the rest of the designs and/or algorithms that were reported. However, the work in [44] considered only human targets, whereas the proposed work considered both on-road and aerial targets simultaneously. To estimate the angle, only one Tx and one Rx antenna were required in the proposed concept. Furthermore, there is no limit to the number of targets that can be detected.
Table 3. Results of measurement error analysis using the Pearson correlation coefficient, and the bias, SD, and BAR metrics obtained from Bland-Altman analysis.

Conclusions
In this paper, we presented a novel AoA estimation technique based on mechanical rotation using mmWave FMCW radars. The proposed method estimates the AoA of multiple targets using only a single transmitter and receiver. The measurements were taken in real-world scenarios involving pedestrians, a car, and a UAV. Based on the measurements and collected radar data, range-angle maps were created, and morphological operators were used to estimate the AoA of the multiple targets. Furthermore, we demonstrated radar range-angle images for improved visual representation. The proposed method will be extremely beneficial for ground stations, traffic control and monitoring applications for both on-ground and airborne vehicles. The cross-range resolution is an interesting topic for further study; however, it requires a large number of outdoor experiments. We plan to continue our work in the future by taking more outdoor measurements.
Data Availability Statement: The detailed data set is available at https://github.com/wilsonan/mmWave_RangeAngle_Dataset (accessed on 12 October 2021).